
P C Yuen - Hong Kong Baptist University
266 Publications
36,143 Reads
12,927 Citations
Publications (266)
Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm where different parties collaboratively learn models with partitioned features of shared samples, without leaking private data. Recent research has shown promising results addressing various challenges in VFL, highlighting its potential for practical application...
Vertical Federated Learning (VFL) has emerged as a crucial privacy-preserving learning paradigm that involves training models using distributed features from shared samples. However, the performance of VFL can be hindered when the number of shared or aligned samples is limited, a common issue in mobile environments where user data are diverse and u...
Although federated learning (FL) has achieved outstanding results in privacy-preserved distributed learning, the setting of model homogeneity among clients restricts its wide application in practice. This article investigates a more general case, namely, model-heterogeneous FL (M-hete FL), where client models are independently designed and can be s...
Remote Photoplethysmography (rPPG) has been attracting increasing attention due to its potential in a wide range of application scenarios such as physical training, clinical monitoring, and face anti-spoofing. Beyond conventional solutions, deep-learning approaches have started to dominate rPPG estimation and achieve top-level performance. However,...
Automatic lesion segmentation is important for assisting doctors in the diagnostic process. Recent deep learning approaches heavily rely on large-scale datasets, which are difficult to obtain in many clinical applications. Leveraging external labelled datasets is an effective solution to tackle the problem of insufficient training data. In this pap...
Coronavirus disease 2019 (COVID-19) has become a severe global pandemic. Accurate pneumonia infection segmentation is important for assisting doctors in diagnosing COVID-19. Deep learning-based methods can be developed for automatic segmentation, but the lack of large-scale well-annotated COVID-19 training datasets may hinder their performance. Sem...
Liver biopsy images play a key role in the diagnosis of global non-alcoholic fatty liver disease (NAFLD). The NAFLD activity score (NAS) on liver biopsy images grades the amount of histological findings that reflect the progression of NAFLD. However, liver biopsy image analysis remains a challenging task due to its complex tissue structures and spa...
Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. An fPAD model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) are not direct...
Automatic liver tumor segmentation could offer assistance to radiologists in liver tumor diagnosis, and its performance has been significantly improved by recent deep learning based methods. These methods rely on large-scale well-annotated training datasets, but collecting such datasets is time-consuming and labor-intensive, which could hinder thei...
Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversarial defense aims to robustify the network against images perturbed by imperceptible adversarial noise. This pap...
Source-data-free unsupervised domain adaptation (SF-UDA) is an approach to improve model performance in the target domain without accessing the source data. Some SF-UDA methods have been proposed and achieved promising results using the information from source-model parameters. However, current research on information security confirms the ability...
Many unsupervised domain adaptation (UDA) methods have been developed and have achieved promising results in various pattern recognition tasks. However, most existing methods assume that raw source data are available in the target domain when transferring knowledge from the source to the target domain. Due to the emerging regulations on data privac...
Automatic liver tumor segmentation is of great importance for assisting doctors in liver cancer diagnosis and treatment planning. Recently, deep learning approaches trained with pixel-level annotations have contributed many breakthroughs in image segmentation. However, acquiring such accurate dense annotations is time-consuming and labor-intensive,...
Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. The generalization ability of face presentation attack detection models to unseen attacks has become a key issue for real-world deployment, which can be improved when models are trained with face images from different input distributions and dif...
Liver biopsy image analysis is the gold standard for early diagnosis of non-alcoholic fatty liver disease (NAFLD) worldwide. Deep neural networks offer an effective tool for image analysis. However, when applying deep learning methods to smaller histological image datasets, the model may be distracted by dominant normal tissues and ignore critical...
Developing a Universal Lesion Detector (ULD) that can detect various types of lesions from the whole body is of great importance for early diagnosis and timely treatment. Recently, deep neural networks have been applied for the ULD task, and existing methods assume that all the training samples are well-annotated. However, the partial label problem...
The development of multi-spectrum image sensing technology has brought great interest in exploiting the information of multiple modalities (e.g., RGB and infrared modalities) for solving computer vision problems. In this article, we investigate how to exploit information from RGB and infrared modalities to address two important issues in visual tra...
Face presentation attack detection plays a critical role in the modern face recognition pipeline. A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof...
Unsupervised domain adaptation is an effective approach to solve the problem of dataset bias. However, most existing unsupervised domain adaptation methods assume that the geometry structures of data distributions are similar in the source and target domains. This assumption is invalid in many practical applications, because the training and test d...
Objective: Accurate risk prediction is important for evaluating early medical treatment effects and improving health care quality. Existing methods are usually designed for dynamic medical data, which require long-term observations. Meanwhile, important personalized static information is ignored due to the underlying uncertainty and unquantifiable a...
With the advancement of 3D printing technologies, 3D mask presentation attack becomes a critical challenge in face recognition. To tackle the 3D mask presentation attack detection (PAD), remote Photoplethysmography (rPPG) is employed as an intrinsic detection cue which is independent of the mask material and appearance quality. Although the effecti...
Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversarial defense aims to defend the network against images with imperceptible adversarial perturbations. In this pap...
Influenced by the dynamic changes in the severity of illness, patients usually take examinations in hospitals irregularly, producing a large volume of irregular medical time-series data. Performing diagnosis prediction from the irregular medical time series is challenging because the intervals between consecutive records significantly vary along ti...
Medical time series of laboratory tests have been collected in electronic health records (EHRs) in many countries. Machine-learning algorithms have been proposed to analyze the condition of patients using these medical records. However, medical time series may be recorded using different laboratory parameters in different datasets. This results in t...
Temporal cues in videos provide important information for recognizing actions accurately. However, temporal-discriminative features can hardly be extracted without using an annotated large-scale video action dataset for training. This paper proposes a novel Video-based Temporal-Discriminative Learning (VTDL) framework in a self-supervised manner. Wit...
Deep embedding learning plays a key role in learning discriminative feature representations, where the visually similar samples are pulled closer and dissimilar samples are pushed away in the low-dimensional embedding space. This paper studies the unsupervised embedding learning problem by learning such a representation without using any category l...
It has been shown that face images can be reconstructed from their representations (templates). We propose a randomized CNN to generate protected face biometric templates given the input face image and a user-specific key. The use of user-specific keys introduces randomness to the secure template and hence strengthens the template security. To furt...
Low-dimensional and compact representation of time series data is of importance for mining and storage. In practice, however, time series data are vulnerable to various temporal transformations, such as shift and temporal scaling, which are unavoidable in the process of data collection. If a learning algorithm directly calculates the difference bet...
Face presentation attack detection plays a critical role in the modern face recognition pipeline. A face anti-spoofing (FAS) model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) ar...
Face presentation attacks have become an increasingly critical concern when face recognition is widely applied. Many face anti-spoofing methods have been proposed, but most of them ignore the generalization ability to unseen attacks. To overcome the limitation, this work casts face anti-spoofing as a domain generalization (DG) problem, and attempts...
Person re-identification (Re-ID) has been widely studied by learning a discriminative feature representation with a set of well-annotated training data. Existing models usually assume that all the training samples are correctly annotated. However, label noise is unavoidable due to false annotations in large-scale industrial applications. Different...
Accurate prediction of mortality risk is important for evaluating early treatments, detecting high-risk patients and improving healthcare outcomes. Predicting mortality risk from the irregular clinical time series data is challenging due to the varying time intervals in the consecutive records. Existing methods usually solve this issue by generatin...
This paper addresses the variation generalized feature learning problem in unsupervised video-based person re-identification (re-ID). With advanced tracking and detection algorithms, large-scale intra-view positive samples can be easily collected by assuming that the image frames within the tracking sequence belong to the same person. Existing meth...
In unsupervised domain adaptation, distributions of visual representations are mismatched across domains, which leads to the performance drop of a source model in the target domain. Therefore, distribution alignment methods have been proposed to explore cross-domain visual representations. However, most alignment methods have not considered the dif...
Visible thermal person re-identification (VT-REID) is a task of matching person images captured by thermal and visible cameras, which is an extremely important issue in night-time surveillance applications. Existing cross-modality recognition works mainly focus on learning sharable feature representations to handle the cross-modality discrepancies....
This paper studies the unsupervised embedding learning problem, which requires an effective similarity measurement between samples in low-dimensional embedding space. Motivated by the positive concentrated and negative separated properties observed from category-wise supervised learning, we propose to utilize the instance-wise supervision to approx...
With a large number of video surveillance systems installed to meet industrial security requirements, the task of object tracking, which aims to locate objects of interest in videos, is very important. Although numerous tracking algorithms for RGB videos have been developed in the past decade, the tracking performance and robustness of these systems...
Cross-camera label estimation from a set of unlabelled training data is an extremely important component in unsupervised person re-identification (re-ID) systems. With the estimated labels, existing advanced supervised learning methods can be leveraged to learn discriminative re-ID models. In this paper, we utilize the graph matching technique for...
Shapelets are discriminative local patterns in time series, which maximally distinguish among different classes. Instead of considering full series, shapelet transformation considers the existence or absence of local shapelets, which leads to high classification accuracy, easy visualization and interpretability. One of the limitations of existing me...
In multimedia analysis, one objective of unsupervised visual domain adaptation is to train a classifier that works well on a target domain given labeled source samples and unlabeled target samples. Feature alignment of two domains is the key issue which should be addressed to achieve this objective. Inspired by the recent study of Generative Advers...
To intelligently analyze and understand video content, a key step is to accurately perceive the motion of the objects of interest in videos. To this end, the task of object tracking, which aims to determine the position and status of the object of interest in consecutive video frames, is very important, and has received great research interest in the...
3D mask face presentation attack, as a new challenge in face recognition, has been attracting increasing attention. Recently, remote Photoplethysmography (rPPG) is employed as an intrinsic liveness cue which is independent of the mask appearance. Although existing rPPG-based methods achieve promising results on both intra and cross dataset scenario...
This paper addresses the scalability and robustness issues of estimating labels from imbalanced unlabeled data for unsupervised video-based person re-identification (re-ID). To achieve it, we propose a novel Robust AnChor Embedding (RACE) framework via deep feature representation learning for large-scale unsupervised video re-ID. Within this framew...
3D mask spoofing attacks have been one of the main challenges in face recognition. Compared to a 3D mask, a real face displays different facial motion patterns that are reflected by different facial dynamic textures. However, a large portion of these facial motion differences are subtle. We find that the subtle facial motion can be fully captured b...
In large-scale camera networks, label information for person re-identification is usually not available for a large number of cameras due to expensive human labor efforts. Semi-supervised learning could be employed to train a discriminative classifier by using unlabeled data and unmatched image pairs (negatives) generated from non-overlapping cam...
Cross-modality person re-identification between the thermal and visible domains is extremely important for night-time surveillance applications. Existing works in this field mainly focus on learning sharable feature representations to handle the cross-modality discrepancies. However, besides the cross-modality discrepancy caused by different camera...
Person re-identification is widely studied in the visible spectrum, where all the person images are captured by visible cameras. However, visible cameras may not capture valid appearance information under poor illumination conditions, e.g., at night. In this case, a thermal camera is superior since it is less dependent on the lighting by using infrared li...
Tracking targets of interest is an important step for motion perception in intelligent video surveillance systems. While most recently developed tracking algorithms are grounded in RGB image sequences, it should be noted that information from the RGB modality is not always reliable (e.g., in a dark environment with poor lighting conditions), which urges...
State-of-the-art face recognition systems are based on deep (convolutional) neural networks. Therefore, it is imperative to determine to what extent face templates derived from deep networks can be inverted to obtain the original face image. In this paper, we study the vulnerabilities of a state-of-the-art face recognition system based on template...
In unsupervised domain adaptation, a key research problem is joint distribution alignment across the source and target domains. However, direct alignment of the source and target joint distributions is infeasible, because the target conditional distribution cannot be known without target labels. Instead of estimating target labels for target condit...
Although encouraging results have been obtained in human pose estimation in recent years, the performance may degrade dramatically when the image quality differs between training and testing datasets. This paper addresses problems in cross-image-quality human pose estimation. To achieve this, we follow an unsupervised domain adaptation approach in whi...
The use of multiple features has been shown to be an effective strategy for visual tracking because of their complementary contributions to appearance modeling. The key problem is how to learn a fused representation from multiple features for appearance modeling. Different features extracted from the same object should share some commonalities in t...
Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity in...
Label estimation is an important component in an unsupervised person re-identification (re-ID) system. This paper focuses on cross-camera label estimation, which can be subsequently used in feature learning to learn robust re-ID models. Specifically, we propose to construct a graph for samples in each camera, and then graph matching scheme is intro...
State-of-the-art face recognition systems are based on deep (convolutional) neural networks. Therefore, it is imperative to determine to what extent face templates derived from deep networks can be inverted to obtain the original face image. In this paper, we discuss the vulnerabilities of a face recognition system based on deep templates, extracte...
Because of appearance variations, training samples of the tracked targets collected by the online tracker are required for updating the tracking model. However, this often leads to tracking drift problem because of potentially corrupted samples: 1) contaminated/outlier samples resulting from large variations (e.g. occlusion, illumination), and 2) m...
In this paper, we propose a novel plant identification method based on multipath sparse coding using SIFT features, which avoids the need of feature engineering and the reliance on botanical taxonomy. In particular, the proposed method uses five paths to model the shape and texture features of plant images, and at each path it learns the dictionari...
Biometric cryptosystem has been proven to be a promising approach for template protection. Cryptosystems such as fuzzy extractor and fuzzy commitment require discriminative and informative binary biometric input to offer accurate and secure recognition. In multimodal biometric recognition, binary features can be produced via fusing the real-valued...
Regular medical records are useful for medical practitioners to analyze and monitor patient health status, especially for those with chronic disease, but such records are usually incomplete due to unpunctuality and absence of patients. In order to resolve the missing data problem over time, a tensor-based model is suggested for missing data imputation...
Search algorithms typically involve intensive distance computations and comparisons. In privacy-aware applications such as biometric identification, exposing the distance information may lead to compromise of sensitive data that have privacy and security implications. In this paper, we design an anonymized distance filter that can test and rank ins...
3D mask spoofing attack has been one of the main challenges in face recognition. Among existing methods, texture-based approaches show powerful abilities and achieve encouraging results on 3D mask face anti-spoofing. However, these approaches may not be robust enough in application scenarios and could fail to detect imposters with hyper-real masks....
The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, with a large number of trackers being published at major computer vision conferences and journals in the recent years. The number of tested state-of-...
Using multiple features in appearance modeling has been shown to be effective for visual tracking. In this paper, we dynamically measured the importance of different features and proposed a robust tracker with the weighted features. By doing this, the dictionaries are improved in both reconstructive and discriminative ways. We extracted multiple features...
Self-occlusion is a challenging problem existing in human pose estimation. In this paper we exploit a new cue to solve this problem: the torso orientation. We describe a technique to automatically detect self-occlusion in training set without visibility label. Given this prior information, we are able to jointly learn an occlusion-aware model to ca...
Lack of information in occluded regions leads to inherent ambiguity, which is a big challenge for motion estimation. Recently, the sparse model has been widely used since the essential content of the motion field could be effectively preserved with sparse representation. The methods exploiting sparsity acquire representations either directly in the...
Multi-biometric feature-level fusion exploits feature information from more than one biometric source to improve recognition performance and template security. When ordered and unordered feature sets representing different biometric sources are involved, feature fusion becomes problematic. One way to mitigate this incompatibility problem is to tran...
Smart environments and monitoring systems are popular research areas nowadays due to their potential to enhance the quality of life. Applications such as human behaviour analysis and workspace ergonomics monitoring are automated, thereby improving the well-being of individuals with minimal running cost. The central problem of smart environments is to und...
Person re-identification, which matches person images of the same identity across non-overlapping camera views, has become an important component for cross-camera-view activity analysis. Most (if not all) person re-identification algorithms are designed based on appearance features. However, appearance features are not stable across non-overlapping ca...
Most existing tracking approaches are either based on the tracking-by-detection framework or the tracking-by-matching framework. The former needs to learn a discriminative classifier using positive and negative samples, which will cause tracking drift due to unreliable samples. The latter usually performs tracking by matching local interest points b...
This paper addresses fully automated multi-person tracking in complex environments with challenging occlusion and extensive pose variations. Our solution combines multiple detectors for a set of different regions of interest (e.g., full-body and head) for multi-person tracking. The use of multiple detectors leads to fewer miss detections as it is a...
Retrieving pre-captured human motion for analyzing and synthesizing virtual character movement has been widely used in Virtual Reality (VR) and interactive computer graphics applications. In this paper, we propose a new human pose representation, called Spatial Relations of Human Body Parts (SRBP), to represent spatial relations between body parts...
Visual tracking using multiple features has been proved as a robust approach because features could complement each other. Since different types of variations such as illumination, occlusion and pose may occur in a video sequence, especially long sequence videos, how to properly select and fuse appropriate features has become one of the key problem...
Compact binary codes can in general improve the speed of searches in large-scale applications. Although fingerprint retrieval was studied extensively with real-valued features, only few strategies are available for search in Hamming space. In this paper, we propose a theoretical framework for systematically learning compact binary hash codes and de...
Biometric verification systems are designed to accept multiple similar biometric measurements per user due to inherent intrauser variations in the biometric data. This is important to preserve reasonable acceptance rate of genuine queries and the overall feasibility of the recognition system. However, such acceptance of multiple similar measurement...
Text in a scene provides vital information of its contents. With the increasing popularity of vision systems, detecting general text in images becomes a critical yet challenging task. Most existing methods have focused on extracting neatly-arranged text string for compactly-constructed characters. Motivated by the need to consider the widely varyin...
Symmetric Positive Definite (SPD) matrices in the form of region covariances are considered rich descriptors for images and videos. Recent studies suggest that exploiting the Riemannian geometry of the SPD manifolds could lead to improved performances for vision applications. For tasks involving processing large-scale and dynamic data in computer v...