Guoying Zhao

University of Oulu · Center for Machine Vision and Signal Analysis

PhD

About

Publications: 401
Reads: 149,165
Citations: 25,728
Additional affiliations
September 2002 - July 2005
Chinese Academy of Sciences
Position
  • Research Assistant
July 2005 - present
University of Oulu
Position
  • Professor (Associate)

Publications (401)
Chapter
Incremental Deepfake Detection (IDD) aims to continuously update models with new domain data, adapting to evolving forgery techniques. Existing works require extra buffers to store old exemplars for maintaining previously learned knowledge. However, it is infeasible when previous data is unavailable due to storage and privacy issues. This paper foc...
Chapter
Face forgery detection is crucial in preserving the security and integrity of facial data amidst the rapid developments in face manipulation techniques and deep generative models. Existing methods for video face forgery detection typically assume that all frames in a forged video are manipulated, while identifying partially forged videos with only...
Preprint
Full-text available
Generative models have surged in popularity recently due to their ability to produce high-quality images and video. However, steering these models to produce images with specific attributes and precise control remains challenging. Humans, particularly their faces, are central to content generation due to their ability to convey rich expressions and...
Preprint
Full-text available
Remote photoplethysmography (rPPG) is a non-contact method for measuring cardiac signals from facial videos, offering a convenient alternative to contact photoplethysmography (cPPG) obtained from contact sensors. Recent studies have shown that each individual possesses a unique cPPG signal morphology that can be utilized as a biometric identifier,...
Article
Socially shared regulation plays a pivotal role in the success of collaborative learning. However, evaluating socially shared regulation of learning (SSRL) proves challenging due to the dynamic and infrequent cognitive and socio-emotional interactions, which constitute the focal point of SSRL. To address this challenge, this paper gathers interdisc...
Article
Sketch re-identification (Re-ID) seeks to match pedestrians' photos from surveillance videos with corresponding sketches. However, we observe that existing works still have two critical limitations: (i) cross- and intra-modality discrepancies hinder the extraction of modality-shared features, (ii) standard triplet loss fails to constrain latent fea...
Article
Full-text available
Capacity-limited visual working memory (VWM) requires that individuals have sufficient memory space and the ability to filter distractors. Negative emotional states are known to impact VWM storage, yet their influence on distractor filtering within VWM remains underexplored. We conducted direct neural measurement of participants (n = 56) who conduc...
Article
Full-text available
The topic of achieving rotational invariance in convolutional neural networks (CNNs) has gained considerable attention recently, as this invariance is crucial for many computer vision tasks. In this letter, we propose a sorting convolution operation (SConv), which achieves invariance to arbitrary rotations without additional learnable parameters...
Article
Remote photoplethysmography (rPPG) has considerable significance in areas such as disease diagnosis and emotion analysis. Recent rPPG models have demonstrated excellent performance due to their powerful heart rate information extraction capabilities. However, these models often focus on limited regions of interest (ROIs) on facial images, which makes...
Article
Remote photoplethysmography (rPPG) is an essential way of monitoring the physiological indicator heart rate (HR), which has important guiding significance for preventing and controlling cardiovascular diseases. However, most existing HR measurement approaches require ideal illumination conditions, and the illumination variation in a realistic situa...
Article
Full-text available
Micro-expression (ME) is an involuntary, fleeting, and subtle facial expression. It may occur in high-stake situations when people attempt to conceal or suppress their true feelings. Therefore, MEs can provide essential clues to people’s true feelings and have plenty of potential applications, such as national security, clinical diagnosis, and inte...
Article
Micro-expression recognition (MER) draws intensive research interest as micro-expressions (MEs) can infer genuine emotions. Prior information can guide the model to learn discriminative ME features effectively. However, most works focus on researching the general models with a stronger representation ability to adaptively aggregate ME movement info...
Preprint
Capacity-limited visual working memory (VWM) requires that individuals have sufficient memory space and the ability to filter distractors. Negative emotional states are known to impact VWM storage, yet their influence on distractor filtering within VWM remains underexplored. We conducted direct neural measurement of participants (n=56) who conducte...
Conference Paper
Full-text available
Despite the importance of socially shared regulation of learning (SSRL) in collaborative learning success, there remains a paucity of evidence on how it could be detected and supported effectively. In this paper, we present an experimental study with a systematic analysis approach that utilizes facial expression recognition technology to examine em...
Presentation
When we are in a negative emotional state, our brains are often influenced, which can affect cognitive processes such as visual working memory (VWM). VWM is a short-term memory system used to temporarily store and manipulate visual information. The effect of negative emotional state on performance of VWM has been extensively studied. However, the i...
Preprint
The topic of achieving rotational invariance in convolutional neural networks (CNNs) has gained considerable attention recently, as this invariance is crucial for many computer vision tasks such as image classification and matching. In this letter, we propose a Sorting Convolution (SC) inspired by some hand-crafted features of texture images, which...
Chapter
Modelling temporal dependencies is important for accurate action detection. In this work, we develop a temporal attention unit to mine the global dependencies among features from different temporal locations. Additionally, based on the developed temporal attention unit, we propose an attention-guided boundary refinement module for revising action p...
Chapter
Micro-expressions (MEs) are subtle, quick and involuntary facial muscle movements. Action unit (AU) detection plays an important role in facial micro-expression analysis due to the ambiguity of MEs. Unlike typical AU detection that is performed on macro-expressions, the facial muscle movements are significantly more subtle in MEs. This makes the de...
Preprint
Full-text available
Over the past few decades, multimodal emotion recognition has made remarkable progress with the development of deep learning. However, existing technologies struggle to meet the demands of practical applications. To improve robustness, we launch a Multimodal Emotion Recognition Challenge (MER 2023) to motivate global researchers to build i...
Preprint
Rotational motion blur caused by the circular motion of the camera and/or object is common in everyday life. Identifying objects from images affected by rotational motion blur is challenging because this image degradation severely impacts image quality. Therefore, it is meaningful to develop image invariant features under rotational motion blur and then use...
Preprint
Micro-expression recognition (MER) draws intensive research interest as micro-expressions (MEs) can infer genuine emotions. Prior information can guide the model to learn discriminative ME features effectively. However, most works focus on researching the general models with a stronger representation ability to adaptively aggregate ME movement info...
Chapter
Full-text available
Face presentation attack detection (PAD) has received increasing attention ever since the vulnerabilities to spoofing have been widely recognized. The state of the art in unimodal and multi-modal face anti-spoofing has been assessed in eight international competitions organized in conjunction with major biometrics and computer vision conferences in...
Article
Full-text available
Background: Pain can have a significant impact on an individual's life, as it has both cognitive and affective consequences. However, our understanding of how pain affects social cognition is limited. Previous studies have shown that pain, as an alarm stimulus, can disrupt cognitive processing when focal attention is required, but whether pain also...
Article
Full-text available
We explore using body gestures for hidden emotional state analysis. As an important non-verbal communicative fashion, human body gestures are capable of conveying emotional information during social communication. In previous works, efforts have been made mainly on facial expressions, speech, or expressive body gestures to interpret classical expre...
Article
Full-text available
Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited s...
Preprint
Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited s...
Preprint
Contactless 3D finger knuckle patterns have emerged as an effective biometric identifier due to their discriminativeness, visibility from a distance, and convenience. Recent research has developed a deep feature collaboration network which simultaneously incorporates intermediate features from deep neural networks with multiple scales. However, this...
Article
Micro-expressions have drawn increasing interest lately due to various potential applications. The task is, however, difficult as it incorporates many challenges from the fields of computer vision, machine learning and emotional sciences. Due to the spontaneous and subtle characteristics of micro-expressions, the available training and testing data...
Article
Remote photoplethysmography (rPPG) is a vital way of measuring heart rate (HR) to reflect human physical and mental health, which is useful for diagnosing cardiovascular and neurological diseases. Many non-contact HR estimation methods have been proposed gradually in recent years, but the majority of approaches are based on a single-modal HR inform...
Article
Full-text available
Deep models for facial expression recognition achieve high performance by training on large-scale labeled data. However, publicly available datasets contain uncertain facial expressions caused by ambiguous annotations or confusing emotions, which could severely degrade robustness. Previous studies usually follow the bias elimination method in g...
Article
Micro-expression recognition (MER) holds significance in uncovering hidden emotions. Most works take image sequences as input and cannot effectively explore ME information because subtle ME-related motions are easily submerged in unrelated information. Instead, the facial landmark is a low-dimensional and compact modality, which achieves lower compu...
Article
Full-text available
Survey/review study. From Emotion AI to Cognitive AI. Guoying Zhao*, Yante Li, and Qianru Xu. University of Oulu, Pentti Kaiteran Katu 1, Linnanmaa 90570, Finland. *Correspondence: guoying.zhao@oulu.fi. Received: 22 September 2022; Accepted: 28 November 2022; Published: 22 December 2022. Abstract: Cognitive computing is recognized as the next era of com...
Preprint
Full-text available
Deep models for facial expression recognition achieve high performance by training on large-scale labeled data. However, publicly available datasets contain uncertain facial expressions caused by ambiguous annotations or confusing emotions, which could severely degrade robustness. Previous studies usually follow the bias elimination method in g...
Preprint
Full-text available
Micro-expressions have drawn increasing interest lately due to various potential applications. The task is, however, difficult as it incorporates many challenges from the fields of computer vision, machine learning and emotional sciences. Due to the spontaneous and subtle characteristics of micro-expressions, the available training and testing data...
Preprint
Full-text available
In recent years, convolutional neural networks (CNNs) have shown good performance in many image processing and computer vision tasks. However, a standard CNN model is not invariant to image rotations. In fact, even a slight rotation of an input image will seriously degrade its performance. This shortcoming precludes the use of CNNs in some practical scenarios....
Article
Full-text available
Face anti-spoofing (FAS) has lately attracted increasing attention due to its vital role in securing face recognition systems from presentation attacks (PAs). As more and more realistic PAs with novel types spring up, early-stage FAS methods based on handcrafted features become unreliable due to their limited representation capacity. With the emerg...
Preprint
Full-text available
Collaborative learning is an educational approach that enhances learning through shared goals and working together. Interaction and regulation are two essential factors related to the success of collaborative learning. Since the information from various modalities can reflect the quality of collaboration, a new multimodal dataset with cognitive and...
Article
Full-text available
Being spontaneous, micro-expressions are useful in the inference of a person's true emotions even if an attempt is made to conceal them. Due to their short duration and low intensity, the recognition of micro-expressions is a difficult task in affective computing. The early work based on handcrafted spatio-temporal features which showed some promis...
Article
Full-text available
Micro-expressions (MEs) are involuntary facial movements revealing people's hidden feelings in high-stake situations and have practical importance in various fields. Early methods for Micro-expression Recognition (MER) are mainly based on traditional features. Recently, with the success of Deep Learning (DL) in various tasks, neural networks have r...
Article
Video object segmentation (VOS) is a critical yet challenging task in video analysis. Recently, many pixel-level matching VOS methods have achieved an outstanding performance without significant time consumption in fine-tuning. However, most of these methods pay little attention to (i) matching background pixels and (ii) optimizing discriminable em...
Preprint
Full-text available
Pain can significantly impact our lives, and it has both cognitive and affective consequences for individuals. Nevertheless, our understanding of the effects of pain on social cognition remains limited. We investigated how pain affected the perception of facial emotions by recording brain responses to task-irrelevant faces in healthy participants....
Article
Emotion recognition from body gestures is challenging since similar emotions can be expressed by arbitrary spatial configurations of joints, which results in relying on modeling spatial-temporal patterns from a more global level. However, most recent powerful graph convolution networks (GCNs) separate the spatial and temporal modeling into isolated...
Article
We present a customized 3D mesh Transformer model for the pose transfer task. As the 3D pose transfer essentially is a deformation procedure dependent on the given meshes, the intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism. Specifically, we propose a novel geomet...
Preprint
Full-text available
Micro-expression recognition (MER) is valuable because the involuntary nature of micro-expressions (MEs) can reveal genuine emotions. Most works recognize MEs by taking RGB videos or images as input. In fact, the activated facial regions in ME images are very small and the subtle motion can be easily submerged in the unrelated information. Facial l...
Preprint
Full-text available
High-quality annotated images are significant to deep facial expression recognition (FER) methods. However, uncertain labels, mostly existing in large-scale public datasets, often mislead the training process. In this paper, we achieve uncertain label correction of facial expressions using auxiliary action unit (AU) graphs, called ULC-AG. Specifica...
Preprint
Full-text available
The inherent slow imaging speed of Magnetic Resonance Image (MRI) has spurred the development of various acceleration methods, typically through heuristically undersampling the MRI measurement domain known as k-space. Recently, deep neural networks have been applied to reconstruct undersampled k-space data and have shown improved reconstruction per...
Preprint
Full-text available
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems from presentation attacks. Benefitted from the maturing camera sensors, single-modal (RGB) and multi-modal (e.g., RGB+Depth) FAS has been applied in various scenarios with different configurations of sensors/modalities. Existing single- and multi-modal FAS methods usua...
Article
Full-text available
Pain is a complex phenomenon, the experience of which varies widely across individuals. At worst, chronic pain can lead to anxiety and depression. Cost-effective strategies are urgently needed to improve the treatment of pain, and thus we propose a novel home-based pain measurement system for the longitudinal monitoring of pain experience and varia...
Article
Single image super-resolution is an ill-posed problem, whose purpose is to acquire a high-resolution image from its degraded observation. Existing deep learning-based methods are compromised on their performance and speed due to the heavy design (i.e., huge model size) of networks. In this paper, we propose a novel high-performance cross-domain het...
Preprint
With the development of data acquisition technology, multi-channel data is collected and widely used in many fields. Most of them can be expressed as various types of vector functions. Feature extraction of vector functions for identifying certain patterns of interest is a critical but challenging task. In this paper, we focus on constructing momen...
Article
Full-text available
Micro-expressions (MEs) are a special form of facial expression that may occur when people try to hide their true feelings for some reason. MEs are important clues to people’s true feelings, but they are difficult or impossible for ordinary people to capture with the naked eye because they are very short and subtle. It is expected that robust compu...
Article
Full-text available
As one of the most important affective signals, facial affect analysis (FAA) is essential for developing human-computer interaction systems. Early methods focus on extracting appearance and geometry features associated with human affects while ignoring the latent semantic information among individual facial changes, leading to limited performance a...
Article
Full-text available
Recently, hyperbolic deep neural networks (HDNNs) have been gaining momentum as the deep representations in the hyperbolic space provide high fidelity embeddings with few dimensions, especially for data possessing hierarchical structure. Such a hyperbolic neural architecture is quickly extended to many different scientific fields, including natural...
Preprint
Full-text available
Face presentation attack detection (PAD) has received increasing attention ever since the vulnerabilities to spoofing have been widely recognized. The state of the art in unimodal and multi-modal face anti-spoofing has been assessed in eight international competitions organized in conjunction with major biometrics and computer vision conferences in...
Preprint
We present a customized 3D mesh Transformer model for the pose transfer task. As the 3D pose transfer essentially is a deformation procedure dependent on the given meshes, the intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism. Specifically, we propose a novel geomet...