
Li Liu, University of Oulu
124 Publications
46,533 Reads
9,547 Citations
Publications (124)
The absence of publicly available, large-scale, high-quality datasets for Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) has significantly hindered the application of rapidly advancing deep learning techniques, which hold huge potential to unlock new capabilities in this field. This is primarily because collecting large volumes of...
Xinyi Ying, Chao Xiao, Wei An, [...], Li Liu
Visible-thermal small object detection (RGBT SOD) is a significant yet challenging task with a wide range of applications, including video surveillance, traffic monitoring, search and rescue. However, existing studies mainly focus on either visible or thermal modality, while RGBT SOD is rarely explored. Although some RGBT datasets have been develop...
While deep learning excels in computer vision tasks with abundant labeled data, its performance diminishes significantly in scenarios with limited labeled samples. To address this, Few-shot learning (FSL) enables models to perform the target tasks with very few labeled examples by leveraging prior knowledge from related tasks. However, traditional...
While recent debiasing methods for Scene Graph Generation (SGG) have shown impressive performance, these efforts often attribute model bias solely to the long-tail distribution of relationships, overlooking the more profound causes stemming from skewed object and object pair distributions. In this paper, we employ causal inference techniques to mod...
Despite the remarkable progress in synthetic aperture radar automatic target recognition (SAR ATR), recent efforts have concentrated on detecting and classifying a specific category, e.g., vehicles, ships, airplanes, or buildings. One of the fundamental limitations of the top-performing SAR ATR methods is that the learning paradigm is supervised,...
Few-shot Class-Incremental Learning (FSCIL) presents a unique challenge in Machine Learning (ML), as it necessitates the Incremental Learning (IL) of new classes from sparsely labeled training samples without forgetting previous knowledge. While this field has seen recent progress, it remains an active exploration area. This paper aims to provide a...
Bias in Foundation Models (FMs) - trained on vast datasets spanning societal and historical knowledge - poses significant challenges for fairness and equity across fields such as healthcare, education, and finance. These biases, rooted in the overrepresentation of stereotypes and societal inequalities in training data, exacerbate real-world discrim...
Currently, reinforcement learning (RL) has been applied for the multi-target detection task of MIMO radar. However, the existing methods still have two shortcomings: 1) the detection performance on weak targets is insufficient, and 2) the solving time of beam optimization scheme (BOS) is long. For the first issue, this paper first proposes a partia...
Xinyi Ying, Li Liu, Zaipin Lin, [...], Wei An
Multi-frame infrared small target (MIRST) detection in satellite videos is a long-standing, fundamental yet challenging task for decades, and the challenges can be summarized as: First, extremely small target size, highly complex clutters & noises, various satellite motions result in limited feature representation, high false alarms, and difficult...
Vision Transformers (ViTs) have demonstrated superior performance in various remote sensing tasks, such as Optical Remote Sensing Images Salient Object Detection (ORSI-SOD). However, the high resolution of remote sensing images and the substantial computational costs pose significant challenges for deploying existing methods on resource-constrained...
The goal of Few-Shot Continual Learning (FSCL) is to incrementally learn novel tasks with limited labeled samples and preserve previous capabilities simultaneously. However, current FSCL works lack research on domain increment and domain generalization ability, which cannot cope with changes in the visual perception environment. In this paper, we s...
The growing Synthetic Aperture Radar (SAR) data can build a foundation model using self-supervised learning (SSL) methods, which can achieve various SAR automatic target recognition (ATR) tasks with pretraining in large-scale unlabeled data and fine-tuning in small-labeled samples. SSL aims to construct supervision signals directly from the data, m...
The fundamental challenge in SAR target detection lies in developing discriminative, efficient, and robust representations of target characteristics within intricate non-cooperative environments. However, accurate target detection is impeded by factors including the sparse distribution and discrete features of the targets, as well as complex backgr...
Occlusion is a longstanding difficulty that challenges the UAV-based object detection. Many works address this problem by adapting the detection model. However, few of them exploit that the UAV could fundamentally improve detection performance by changing its viewpoint. Active Object Detection (AOD) offers an effective way to achieve this purpose....
Transferable targeted adversarial attacks (TTAs) against deep neural networks have been proven significantly more challenging than untargeted ones, yet they remain relatively underexplored. This paper sheds new light on performing highly efficient yet transferable targeted attacks leveraging the simple gradient-based baseline. Our research undersco...
Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) plays a pivotal role in civilian and military applications. However, the limited labeled samples present a significant challenge in deep learning-based SAR ATR. Few-shot learning (FSL) offers a potential solution, but models trained with limited samples may produce a high probability...
Currently, the contextual bandit (CB) has been applied in the field of slow-moving small target detection on sea surface, to solve the performance decline problem of feature detection method under less coherent pulse number. However, the existing method only gives a brief description on its implementation details, and also has two shortcomings: 1)...
Synthetic aperture radar (SAR) automatic target recognition (ATR) is extensively applied in both military and civilian sectors. Nevertheless, test and training data distribution may differ in the open world. Therefore, SAR out-of-distribution (OOD) detection is important because it enhances the reliability and adaptability of SAR systems. However,...
Small object detection (SOD) has been a longstanding yet challenging task for decades, with numerous datasets and algorithms being developed. However, they mainly focus on either visible or thermal modality, while visible-thermal (RGBT) bimodality is rarely explored. Although some RGBT datasets have been developed recently, the insufficient quantit...
Chao Xiao, Wei An, Yifan Zhang, [...], Li Liu
Moving object detection in satellite videos (SVMOD) is a challenging task due to the extremely dim and small target characteristics. Current learning-based methods extract spatio-temporal information from multi-frame dense representation with labor-intensive manual labels to tackle SVMOD, which needs high annotation costs and contains tremendous co...
Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment. As a powerful AI strategy, deep learning techniques have extensively promoted the development of visual speech learning. Over the past five ye...
This article proposes a novel module called middle spectrum grouped convolution (MSGC) for efficient deep convolutional neural networks (DCNNs) with the mechanism of grouped convolution. It explores the broad “middle spectrum” area between channel pruning and conventional grouped convolution. Compared with channel pruning, MSGC can retain most of t...
With the rapid acquisition of high-resolution Synthetic Aperture Radar (SAR) images, new categories are continually observed with few-shot instances in openly non-cooperative scenarios. Powering a SAR Automatic Target Recognition (SAR ATR) system with the ability of few-shot class-incremental learning (FSCIL) is nontrivial. Observing the pronounced a...
The data distribution of synthetic aperture radar (SAR) vehicle targets in the actual missions is often imbalanced. However, the recent algorithms for SAR target recognition are designed either under abundant samples, or the situation of few labeled samples among all the categories. These cases all avoid facing the difficulties of imbalanced data d...
Achieving the automatic target recognition in synthetic aperture radar (SAR) imagery is a long-standing difficulty because of the limited training samples and its sensitivity to imaging condition. Active target recognition methods can offer an innovative perspective to improve the recognition accuracy compared to its passive counterpart. Although p...
Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) has ushered in a new era dominated by deep-learning (DL) techniques. However, the DL-based recognition systems inevitably confront catastrophic forgetting for learned knowledge and overfitting for the new, once deployed in openly dynamic scenarios where targets of new classes conti...
Current synthetic aperture radar automatic target recognition (SAR-ATR) methods still struggle with overfitting due to small amounts of training data, as well as black-box opacity and high computational requirements. Unmanned aerial vehicles (UAVs), as the mainstream means of acquiring SAR data, place higher requirements on ATR alg...
Recent research in remote sensing object detection (RSOD) has significantly advanced the development of vision foundation models. However, deploying these models on resource-constrained edge devices is challenging due to their high computational demands. Binarized detectors utilize binary neural networks (BNNs) to achieve extreme compression by qua...
Recently, there have been tremendous efforts in developing lightweight Deep Neural Networks (DNNs) with satisfactory accuracy, which can enable the ubiquitous deployment of DNNs in edge devices. The core challenge of developing compact and efficient DNNs lies in how to balance the competing goals of achieving high accuracy and high efficiency. In t...
Given a model well-trained with a large-scale base dataset, few-shot class-incremental learning (FSCIL) aims at incrementally learning novel classes from a few labeled samples by avoiding overfitting, without catastrophically forgetting all encountered classes previously. Currently, semi-supervised learning technique that harnesses freely available...
LBP is a successful hand-crafted feature descriptor in computer vision. However, in the deep learning era, deep neural networks, especially convolutional neural networks (CNNs) can automatically learn powerful task-aware features that are more discriminative and of higher representational capacity. To some extent, such hand-crafted features can be...
This paper proposes a novel module called middle spectrum grouped convolution (MSGC) for efficient deep convolutional neural networks (DCNNs) with the mechanism of grouped convolution. It explores the broad "middle spectrum" area between channel pruning and conventional grouped convolution. Compared with channel pruning, MSGC can retain most of the...
Given a model well-trained with a large-scale base dataset, Few-Shot Class-Incremental Learning (FSCIL) aims at incrementally learning novel classes from a few labeled samples by avoiding overfitting, without catastrophically forgetting all encountered classes previously. Currently, semi-supervised learning technique that harnesses freely-available...
Among the current methods of synthetic aperture radar (SAR) automatic target recognition (ATR), unlabeled measured data and labeled simulated data are widely used to elevate the performance of SAR ATR. In view of this, the setting of semi-supervised few-shot SAR vehicle recognition is proposed to use these two forms of data to cope with the problem...
Recent years have witnessed a remarkable breakthrough in Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) with the development of deep learning (DL). Nonetheless, once deployed, the DL-based methods’ ability to incrementally learn new knowledge from few-shot samples without forgetting the old is fragile, hindering them from discrimin...
By training first with a large base dataset, Few-Shot Class-Incremental Learning (FSCIL) aims at continually learning a sequence of few-shot learning tasks with novel classes. There are mainly two challenges in FSCIL: the overfitting issue of novel classes with limited labeled samples and the catastrophic forgetting of previously seen classes. The c...
Facial kinship verification refers to automatically determining whether two people have a kin relation from their faces. It has become a popular research topic due to potential practical applications. Over the past decade, many efforts have been devoted to improving the verification performance from human faces only while lacking other biometric in...
Developing lightweight Deep Convolutional Neural Networks (DCNNs) and Vision Transformers (ViTs) has become one of the focuses in vision research since the low computational cost is essential for deploying vision models on edge devices. Recently, researchers have explored highly computational efficient Binary Neural Networks (BNNs) by binarizing we...
Efficiency and robustness are increasingly needed for applications on 3D point clouds, with the ubiquitous use of edge devices in scenarios like autonomous driving and robotics, which often demand real-time and reliable responses. The paper tackles the challenge by designing a general framework to construct 3D learning architectures with SO(3) equi...
In recent years, deep learning has brought significant progress for the problem of synthetic aperture radar (SAR) target classification. However, SAR image characteristics are highly sensitive to the change of imaging conditions. The inconsistency of imaging parameters (especially the depression angle) leads to the distribution shift between the tr...
The goal of Facial Kinship Verification (FKV) is to automatically determine whether two individuals have a kin relationship or not from their given facial images or videos. It is an emerging and challenging problem that has attracted increasing attention due to its practical applications. Over the past decade, significant progress has been achieved...
Face recognition is one of the most active tasks in computer vision and has been widely used in the real world. With great advances made in convolutional neural networks (CNN), lots of face recognition algorithms have achieved high accuracy on various face datasets. However, existing face recognition algorithms based on CNNs are vulnerable to noise...
Facial kinship verification refers to automatically determining whether two people have a kin relation from their faces. It has become a popular research topic due to potential practical applications, such as finding missing children, family photo organization, or criminal investigations. Over the past decade, many efforts have been devoted to im...
Multi-view clustering has received increasing attention due to its effectiveness in fusing complementary information without manual annotations. Most previous methods hold the assumption that each instance appears in all views. However, it is not uncommon to see that some views may contain some missing instances, which gives rise to incomplete mult...
In recent years, deep learning has brought significant progress for the problem of Synthetic Aperture Radar (SAR) target classification. However, SAR image characteristics are highly sensitive to the change of imaging conditions. The inconsistency of imaging parameters (especially the depression angle) leads to the distribution shift between the tr...
Aircraft detection in Synthetic Aperture Radar (SAR) imagery is a challenging task in SAR Automatic Target Recognition (SAR ATR) areas due to aircraft's extremely discrete appearance, obvious intraclass variation, small size and serious background's interference. In this paper, a single-shot detector namely Attentional Feature Refinement and Alignm...
Unsupervised Domain Adaptation (UDA) aims at learning a classifier for an unlabeled target domain by transferring knowledge from a labeled source domain with a related but different distribution. The strategy of aligning the two domains in latent feature space via metric discrepancy or adversarial learning has achieved considerable progress. Howeve...
Yawen Cui, Wanxia Deng, Xin Xu, [...], Li Liu
Class-Incremental Learning (CIL) aims at incrementally learning novel classes without forgetting old ones. This capability becomes more challenging when novel tasks contain one or a few labeled training samples, which leads to a more practical learning scenario, i.e., Few-Shot Class-Incremental Learning (FSCIL). The dilemma on FSCIL lies in seri...
Lip reading is the task of decoding text from speakers' mouth movements. Numerous deep learning-based methods have been proposed to address this task. However, these existing deep lip reading models suffer from poor generalization due to overfitting the training data. To resolve this issue, we present a novel learning paradigm that aims to improve...
Aircraft detection in Synthetic Aperture Radar (SAR) imagery is a challenging task in SAR Automatic Target Recognition (SAR ATR) areas due to aircraft’s extremely discrete appearance, obvious intraclass variation, small size and serious background’s interference. In this paper, a single-shot detector namely Attentional Feature Refinement and Alignm...
As is well-known, defects precisely affect the lives and functions of the machines in which they occur, and even cause potentially catastrophic casualties. Therefore, quality assessment before mounting is an indispensable requirement for factories. Apart from the recognition accuracy, current networks suffer from excessive computing complexity, mak...
Synthetic aperture radar (SAR) target recognition faces the challenge that there are very little labeled data. Although few-shot learning methods are developed to extract more information from a small amount of labeled data to avoid overfitting problems, recent few-shot or limited-data SAR target recognition algorithms overlook the unique SAR imagi...
Binary neural networks (BNNs) constrain weights and activations to +1 or -1 with limited storage and computational cost, which is hardware-friendly for portable devices. Recently, BNNs have achieved remarkable progress and been adopted into various fields. However, the performance of BNNs is sensitive to activation distribution. The existing BNNs u...
Face perception is an essential and significant problem in pattern recognition, concretely including Face Recognition (FR), Facial Expression Recognition (FER), and Race Categorization (RC). Though handcrafted features perform well on face images, Deep Convolutional Neural Networks (DCNNs) have brought new vitality to this field recently. Vanilla D...
Unsupervised Domain Adaptation aims to learn a classifier for an unlabeled target domain by transferring knowledge from a labeled source domain. Most existing approaches learn domain-invariant features by adapting the entire information of each image. However, forcing adaptation of domain-specific components can undermine the effectiveness of learn...
Few-Shot Learning (FSL) is a challenging and practical learning pattern, aiming to solve a target task which has only a few labeled examples. Currently, the field of FSL has made great progress, but largely in the supervised setting, where a large auxiliary labeled dataset is required for offline training. However, the unsupervised FSL (UFSL) probl...
Zhuo Su, Wenzhe Liu, Zitong Yu, [...], Li Liu
Recently, deep Convolutional Neural Networks (CNNs) can achieve human-level performance in edge detection with the rich and abstract edge representation capacities. However, the high performance of CNN based edge detection is achieved with a large pretrained CNN backbone, which is memory and energy consuming. In addition, it is surprising that the...
Uncertainty quantification (UQ) methods play a pivotal role in reducing the impact of uncertainties during both optimization and decision making processes. They have been applied to solve a variety of real-world applications in science and engineering. Bayesian approximation and ensemble learning techniques are two of the most widely-used types of...
Convolutional neural networks have made great achievements in the field of optical image classification in recent years. However, for Synthetic Aperture Radar automatic target recognition (SAR-ATR) tasks, the performance of deep learning networks is always degraded by the insufficient size of SAR images, which causes both severe over-fitting and low...
Recently, automatic pain assessment technology, in particular automatically detecting pain from facial expressions, has been developed to improve the quality of pain management, and has attracted increasing attention. In this paper, we propose self-supervised learning for automatic yet efficient pain assessment, in order to reduce the cost of colle...
In an era of ubiquitous large-scale evolving data streams, data stream clustering (DSC) has received lots of attention because the scale of the data streams far exceeds the ability of expert human analysts. It has been observed that high-dimensional data are usually distributed in a union of low-dimensional subspaces. In this article, we propose a...
State-of-the-art single depth image-based 3D hand pose estimation methods are based on dense predictions, including voxel-to-voxel predictions, point-to-point regression, and pixel-wise estimations. Despite the good performance, those methods have a few issues in nature, such as the poor trade-off between accuracy and efficiency, and plain feature...
Replacing normal convolutions with group convolutions can significantly increase the computational efficiency of modern deep convolutional networks, which has been widely adopted in compact network architecture designs. However, existing group convolutions undermine the original network structures by cutting off some connections permanently resulti...
Recent research treats radar emitter classification (REC) problems as typical closed-set classification problems, i.e., assuming all radar emitters are cooperative and their pulses can be pre-obtained for training the classifiers. However, such overly ideal assumptions have made it difficult to fit real-world REC problems into such restricted model...
Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in t...
The papers in this special section examine compact and efficient feature representation and learning in computer vision.
Facial Micro-Expressions (MEs) are spontaneous, involuntary facial movements when a person experiences an emotion but deliberately or unconsciously attempts to conceal his or her genuine emotions. Recently, ME recognition has attracted increasing attention due to its potential applications such as clinical diagnosis, business negotiation, interroga...
Texture is a fundamental characteristic of many types of images, and texture representation is one of the essential and challenging problems in computer vision and pattern recognition which has attracted extensive research attention over several decades. Since 2000, texture representations based on Bag of Words and on Convolutional Neural Networks...
Generic object detection, aiming at locating object instances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in recent years as powerful methods for learning feature representations directly from data, and have led to r...
A key problem within data mining is clustering of data streams. Most existing algorithms for data stream clustering are based on quite restrictive models for the cluster dynamics. In an attempt to overcome the limitations of existing methods, we propose a novel data stream clustering method, which we refer to as improved streaming affinity propagat...
Convolutional Neural Networks have achieved unprecedented successes in computer vision fields, but they remain challenged by the problem of how to effectively process the orientation transformation of objects with fewer parameters. In this paper, we propose a new convolutional module, Local Binary orientation Module (LBoM), which takes advantage...
Research in texture recognition often concentrates on recognizing textures with intraclass variations such as illumination, rotation, viewpoint and small scale changes. In contrast, in real-world applications a change in scale can have a dramatic impact on texture appearance, to the point of changing completely from one texture category to another....
Texture is a fundamental characteristic of many types of images, and texture representation is one of the essential and challenging problems in computer vision and pattern recognition which has attracted extensive research attention. Since 2000, texture representations based on Bag of Words (BoW) and on Convolutional Neural Networks (CNNs) have bee...
In this paper, we propose a robust local descriptor for face recognition. It consists of two components, one based on a shearlet-decomposition and the other on local binary pattern (LBP). Shearlets can completely analyze the singular structures of piecewise smooth images, which is useful since singularities and irregular structures carry useful inf...
In recent years, a wide variety of different texture descriptors has been proposed, including many LBP variants. New types of descriptors based on multistage convolutional networks and deep learning have also emerged. In different papers the performance comparison of the proposed methods to earlier approaches is mainly done with some well-known tex...
This paper presents a local feature based shape matching algorithm for expression-invariant 3D face recognition. Each 3D face is first automatically detected from a raw 3D data and normalized to achieve pose invariance. The 3D face is then represented by a set of keypoints and their associated local feature descriptors to achieve robustness to expr...