Hu Han

  • Ph.D.
  • Professor at Institute of Computing Technology, Chinese Academy of Sciences

About

97
Publications
61,893
Reads
7,308
Citations
Introduction
https://sites.google.com/site/huhanhomepage My research focuses on unconstrained and heterogeneous face recognition (e.g., sketch to mugshot), 3D face modeling, demographic (attribute) estimation, and soft biometrics. 05/2015: I joined the Institute of Computing Technology, CAS as an associate professor. 01/2015-04/2015: I was a visiting scholar at Google ATAP, Mountain View, CA. 10/2011-01/2015: I was a Research Associate in the Dept. of Computer Science and Engineering at Michigan State University.
Current institution
Additional affiliations
May 2015 - present
Institute of Computing Technology, Chinese Academy of Sciences
Position
  • Professor (Associate)
Description
  • Computer vision and biometrics, covering unconstrained face recognition, heterogeneous face recognition, 3D face modeling, demographic attribute estimation, and soft biometrics.
January 2015 - April 2015
Google Inc.
Position
  • Visiting researcher
Description
  • R&D of advanced biometric authentication technologies.
October 2011 - March 2015
Michigan State University
Position
  • Research Associate
Education
September 2005 - July 2011
Chinese Academy of Sciences
Field of study
  • Computer Science
September 2001 - July 2005
Shandong University
Field of study
  • Computer Science

Publications

Preprint
Medical image segmentation is essential for clinical diagnosis, surgical planning, and treatment monitoring. Traditional approaches typically strive to tackle all medical image segmentation scenarios via one-time learning. However, in practical applications, the diversity of scenarios and tasks in medical image segmentation continues to expand, nec...
Article
Facial recognition (FR) technology offers convenience in our daily lives, but it also raises serious privacy issues due to unauthorized FR applications. To protect facial privacy, existing methods have proposed adversarial face examples that can fool FR systems. However, most of these methods work only in the digital domain and do not consider natu...
Preprint
Although multimodal large language models (MLLMs) have achieved promising results on a wide range of vision-language tasks, their ability to perceive and understand human faces is rarely explored. In this work, we comprehensively evaluate existing MLLMs on face perception tasks. The quantitative results reveal that existing MLLMs struggle to handle...
Conference Paper
Full-text available
Facial action unit (AU) recognition is essential for recognizing fine-grained changes in facial expression, while the demand for a large amount of accurately labeled AU data for training purposes has resulted in high labor costs. Nevertheless, massive face images are widely available and inaccurate labels can be easily obtained, especially as large...
Preprint
Although face analysis has achieved remarkable improvements in the past few years, designing a multi-task face analysis model is still challenging. Most face analysis tasks are studied as separate problems and do not benefit from the synergy among related tasks. In this work, we propose a novel task-adaptive multi-task face analysis method named as...
Chapter
Medical imaging is a non-invasive method for obtaining internal images of the human body or specific body parts by utilizing physical phenomena such as light, electric fields, magnetic fields, and sound waves. In clinical practice, modalities such as X-ray imaging, computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound are most...
Article
Blood vessel and surgical instrument segmentation is a fundamental technique for robot-assisted surgical navigation. Despite the significant progress in natural image segmentation, surgical image-based vessel and instrument segmentation are rarely studied. In this work, we propose a novel self-supervised pretraining method (SurgNet) that can effect...
Conference Paper
Full-text available
Existing studies indicate that deep neural networks (DNNs) can eventually memorize the label noise. We observe that the memorization strength of DNNs towards each instance is different and can be represented by the confidence value, which becomes larger and larger during the training process. Based on this, we propose a Dynamic Instance-specific Se...
Chapter
Affective behavior analysis has aroused researchers’ attention due to its broad applications. However, it is labor exhaustive to obtain accurate annotations for massive face images. Thus, we propose to utilize the prior facial information via Masked Auto-Encoder (MAE) pretrained on unlabeled face images. Furthermore, we combine MAE pretrained Visio...
Article
Semi-supervised learning (SSL) methods show their powerful performance to deal with the issue of data shortage in the field of medical image segmentation. However, existing SSL methods still suffer from the problem of unreliable predictions on unannotated data due to the lack of manual annotations for them. In this paper, we propose an unreliabilit...
Article
Cross-modality face image synthesis such as sketch-to-photo, NIR-to-RGB, and RGB-to-depth has wide applications in face recognition, face animation, and digital entertainment. Conventional cross-modality synthesis methods usually require paired training data, i.e., each subject has images of both modalities. However, paired data can be difficult to...
Chapter
Universal Lesion Detection (ULD) in computed tomography plays an essential role in computer-aided diagnosis. Promising ULD results have been reported by multi-slice-input detection approaches which model 3D context from multiple adjacent CT slices, but such methods still experience difficulty in obtaining a global representation among different sli...
Preprint
Full-text available
Affective behaviour analysis has aroused researchers' attention due to its broad applications. However, it is labor exhaustive to obtain accurate annotations for massive face images. Thus, we propose to utilize the prior facial information via Masked Auto-Encoder (MAE) pretrained on unlabeled face images. Furthermore, we combine MAE pretrained Visi...
Preprint
Universal Lesion Detection (ULD) in computed tomography plays an essential role in computer-aided diagnosis. Promising ULD results have been reported by multi-slice-input detection approaches which model 3D context from multiple adjacent CT slices, but such methods still experience difficulty in obtaining a global representation among different sli...
Conference Paper
Full-text available
In this paper, we propose a new method for remote photoplethysmography (rPPG) based heart rate (HR) estimation. In particular, our proposed method BVPNet is streamlined to predict the blood volume pulse (BVP) signals from face videos. Towards this, we firstly define ROIs based on facial landmarks and then extract the raw temporal signal from each R...
Chapter
Universal Lesion Detection (ULD) in computed tomography plays an essential role in computer-aided diagnosis. Promising ULD results have been reported by coarse-to-fine two-stage detection approaches, but such two-stage ULD methods still suffer from issues like imbalance of positive vs. negative anchors during object proposal and insufficient super...
Article
Apathy is characterized by symptoms such as reduced emotional response, lack of motivation, and limited social interaction. Current methods for apathy diagnosis require the patient’s presence in a clinic and time-consuming clinical interviews, which are costly and inconvenient for both patients and clinical staff, hindering among other large-scale...
Article
Occlusions are often present in face images in the wild, e.g., under video surveillance and forensic scenarios. Existing face de-occlusion methods are limited as they require the knowledge of an occlusion mask. To overcome this limitation, we propose in this paper a new generative adversarial network (named OA-GAN) for natural face de-occlusion wit...
Chapter
Remote physiological measurements, e.g., remote photoplethysmography (rPPG) based heart rate (HR), heart rate variability (HRV) and respiration frequency (RF) measuring, are playing more and more important roles under the application scenarios where contact measurement is inconvenient or impossible. Since the amplitude of the physiological signals...
Preprint
Full-text available
Remote physiological measurements, e.g., remote photoplethysmography (rPPG) based heart rate (HR), heart rate variability (HRV) and respiration frequency (RF) measuring, are playing more and more important roles under the application scenarios where contact measurement is inconvenient or impossible. Since the amplitude of the physiological signals...
Article
Full-text available
Face presentation attack detection (PAD) is essential for securing the widely used face recognition systems. Most of the existing PAD methods do not generalize well to unseen scenarios because labeled training data of the new domain is usually not available. In light of this, we propose an unsupervised domain adaptation with disentangled representa...
Conference Paper
Face presentation attack detection (PAD) has been an urgent problem to be solved in the face recognition systems. Conventional approaches usually assume the testing and training are within the same domain; as a result, they may not generalize well into unseen scenarios because the representations learned for PAD may overfit to the subjects in the t...
Preprint
Full-text available
Face presentation attack detection (PAD) has been an urgent problem to be solved in the face recognition systems. Conventional approaches usually assume the testing and training are within the same domain; as a result, they may not generalize well into unseen scenarios because the representations learned for PAD may overfit to the subjects in the t...
Preprint
Remote measurement of physiological signals from videos is an emerging topic. The topic draws great interests, but the lack of publicly available benchmark databases and a fair validation platform are hindering its further development. For this concern, we organize the first challenge on Remote Physiological Signal Sensing (RePSS), in which two dat...
Preprint
Full-text available
Combined variations containing low-resolution and occlusion often present in face images in the wild, e.g., under the scenario of video surveillance. While most of the existing face image recovery approaches can handle only one type of variation per model, in this work, we propose a deep generative adversarial network (FCSR-GAN) for performing join...
Article
Combined variations containing low-resolution and occlusion often present in face images in the wild, e.g., under the scenario of video surveillance. While most of the existing face image recovery approaches can handle only one type of variation per model, in this work, we propose a deep generative adversarial network (FCSR-GAN) for performing join...
Preprint
Heart rate (HR) is an important physiological signal that reflects the physical and emotional status of a person. Traditional HR measurements usually rely on contact monitors, which may cause inconvenience and discomfort. Recently, some methods have been proposed for remote HR estimation from face videos; however, most of them focus on well-control...
Preprint
Facial action units (AUs) recognition is essential for emotion analysis and has been widely applied in mental state analysis. Existing work on AU recognition usually requires big face dataset with AU labels; however, manual AU annotation requires expertise and can be time-consuming. In this work, we propose a semi-supervised approach for AU recogni...
Article
Heart rate (HR) is an important physiological signal that reflects the physical and emotional status of a person. Traditional HR measurements usually rely on contact monitors, which may cause inconvenience and discomfort. Recently, some methods have been proposed for remote HR estimation from face videos; however, most of them focus on well-control...
Conference Paper
Full-text available
Face recognition (FR) is being widely used in many applications from access control to smartphone unlock. As a result, face presentation attack detection (PAD) has drawn increasing attentions to secure the FR systems. Traditional approaches for PAD mainly assume that training and testing scenarios are similar in imaging conditions (illumination, s...
Conference Paper
Full-text available
Face presentation attack detection (PAD) has drawn increasing attentions to secure face recognition (FR) systems which are being widely used in many applications from access control to smartphone unlock. Traditional approaches for PAD may lack good generalization capability into new application scenarios due to the limited number of subjects and da...
Chapter
In the past few years, great efforts have been devoted to scene text detection. Nevertheless, efficient text detection in the wild remains a challenging problem. Methods for general object detection usually have limitations in handling the arbitrary orientations and large aspect ratios of scene text. In this paper, we present a novel scene text det...
Chapter
Multi-label classification is an essential problem in image classification, because there are usually multiple related tags associated with each image. However, building a large scale multi-label dataset with clean labels can be very expensive and difficult. Therefore, utilizing a small set of data with verified labels and massive data with noise l...
Chapter
Heart rate (HR) is an important physiological signal that reflects the physical and emotional activities of humans. Traditional HR measurements are mainly based on contact monitors, which are inconvenient and may cause discomfort for the subjects. Recently, methods have been proposed for remote HR estimation from face videos. However, most of the e...
Conference Paper
Full-text available
In this work, we propose an end-to-end approach for robust remote heart rate (HR) measurement gleaned from facial videos. Specifically, the approach is based on remote photoplethysmography (rPPG), which constitutes a pulse triggered perceivable chromatic variation, sensed in RGB-face videos. Incidentally rPPGs can be affected in less-constraine...
Article
The explosive growth of digital images in video surveillance and social media has led to the significant need for efficient search of persons of interest in law enforcement and forensic applications. Despite tremendous progress in primary biometric traits (e.g., face and fingerprint) based person identification, a single biometric trait alone can n...
Preprint
The explosive growth of digital images in video surveillance and social media has led to the significant need for efficient search of persons of interest in law enforcement and forensic applications. Despite tremendous progress in primary biometric traits (e.g., face and fingerprint) based person identification, a single biometric trait alone canno...
Preprint
Heart rate (HR) is an important physiological signal that reflects the physical and emotional activities of humans. Traditional HR measurements are mainly based on contact monitors, which are inconvenient and may cause discomfort for the subjects. Recently, methods have been proposed for remote HR estimation from face videos. However, most of the e...
Conference Paper
In this paper, we propose an automatic engagement prediction method for the Engagement in the Wild sub-challenge of EmotiW 2018. We first design a novel Gaze-AU-Pose (GAP) feature taking into account the information of gaze, action units and head pose of a subject. The GAP feature is then used for the subsequent engagement level prediction. To effi...
Chapter
RGB-D face recognition (FR) has drawn increasing attention in recent years with the advances of new RGB-D sensing technologies, and the decrease in sensor price. While a number of multi-modality fusion methods are available in face recognition, there is no known conclusion on how the RGB and depth should be fused. We provide a comparative study of fo...
Article
Face attribute estimation has many potential applications in video surveillance, face retrieval, and social media. While a number of methods have been proposed for face attribute estimation, most of them did not explicitly consider the attribute correlation and heterogeneity (e.g., ordinal vs. nominal attributes) during feature representation learn...
Conference Paper
Full-text available
Action recognition has wide applications from video surveillance, scene understanding to forensic investigation. While recent methods typically focus on a single action recognition from video clips, we investigate the problem of action recognition in crowd, which better replicates real video surveillance scenarios. We propose to perform actions r...
Conference Paper
Full-text available
With the wide applications of user authentication based on face recognition, face spoof attacks against face recognition systems are drawing increasing attentions. While emerging approaches of face anti-spoofing have been reported in recent years, most of them limit to the non-realistic intra-database testing scenarios instead of the cross-database...
Article
With the wide deployment of face recognition systems in applications from de-duplication to mobile device unlocking, security against face spoofing attacks requires increased attention; such attacks can be easily launched via printed photos, video replays and 3D masks of a face. We address the problem of face spoof detection against print (photo) an...
Article
Full-text available
Demographic estimation entails automatic estimation of age, gender and race of a person from his face image, which has many potential applications ranging from forensics to social media. Automatic demographic estimation, particularly age estimation, remains a challenging problem because persons belonging to the same demographic group can be vastly...
Conference Paper
Full-text available
With the wide deployment of face recognition systems in applications from border control to mobile device unlocking, the combat of face spoofing attacks requires increased attention; such attacks can be easily launched via printed photos, video replays and 3D masks. We address the problem of facial spoofing detection against replay attacks based on...
Conference Paper
Full-text available
Mobile devices can carry large amounts of personal data, but are often left unsecured. PIN locks are inconvenient to use and thus have seen low adoption (33% of users). While biometrics are beginning to be used for mobile device authentication, they are used only for initial unlock. Mobile devices secured with only login authentication are still vu...
Article
Full-text available
Automatic face recognition is now widely used in applications ranging from deduplication of identity to authentication of mobile payment. This popularity of face recognition has raised concerns about face spoof attacks (also known as biometric sensor presentation attacks), where a photo or video of an authorized person’s face could be used to gain...
Article
Face recognition in surveillance systems is important for security applications, especially in nighttime scenarios when the subject is far away from the camera. However, due to the face image quality degradation caused by large camera standoff and low illuminance, nighttime face recognition at large standoff is challenging. In this paper, we report...
Article
Full-text available
As face recognition applications progress from constrained sensing and cooperative subjects scenarios (e.g., driver’s license and passport photos) to unconstrained scenarios with uncooperative subjects (e.g., video surveillance), new challenges are encountered. These challenges are due to variations in ambient illumination, image resolution, backgr...
Article
Full-text available
Facial composites are widely used by law enforcement agencies to assist in the identification and apprehension of suspects involved in criminal activities. These composites, generated from witness descriptions, are posted in public places and in the media with the hope that some viewers will provide tips about the identity of the suspect. This meth...
Technical Report
Facial composites are widely used by law enforcement agencies to assist in the identification and apprehension of suspects involved in criminal activities. These composites, generated from witness descriptions, are posted in public places and in the media with the hope that some viewers will provide tips about the identity of the suspect. This meth...
Technical Report
Automatic estimation of demographic attributes (e.g., age, gender, and race) from a face image is a topic of growing interest with many potential applications. Most prior work on this topic has used face images acquired under constrained and cooperative scenarios. This paper addresses the more challenging problem of automatic age, gender, and race...
Technical Report
Full-text available
As face recognition applications progress from constrained sensing and cooperative subjects scenarios (e.g., driver’s license and passport photos) to unconstrained scenarios with uncooperative subjects (e.g., video surveillance), new challenges are encountered. These challenges are due to variations in ambient illumination, image resolution, backgr...
Article
In person identification, recognition failure due to variations of illumination is common. In this study, we employed image-processing techniques to tackle this problem. Participants performed recognition and matching tasks where the face stimuli were either original images or computer-processed images in which shading was weakened via a number of...
Conference Paper
Full-text available
One of the major challenges encountered by face recognition lies in the difficulty of handling arbitrary poses variations. While different approaches have been developed for face recognition across pose variations, many methods either require manual landmark annotations or assume the face poses to be known. These constraints prevent many face recog...
Conference Paper
Full-text available
There has been a growing interest in automatic age estimation from facial images due to a variety of potential applications in law enforcement, security control, and human-computer interaction. However, despite advances in automatic age estimation, it remains a challenging problem. This is because the face aging process is determined not only by in...
Conference Paper
Full-text available
Tattoos on the human body provide important clues to the identity of a suspect. While a tattoo is not a unique identifier, it narrows down the list of identities for the suspect. For these reasons, law enforcement agencies have been collecting tattoo images of the suspects at the time of booking. A few successful attempts have been made to design an au...
Conference Paper
Full-text available
Facial sketches are widely used by law enforcement agencies to assist in the identification and apprehension of suspects involved in criminal activities. Sketches used in forensic investigations are either drawn by forensic artists (forensic sketches) or created with computer software (composite sketches) following the verbal description provided b...
Article
Illumination preprocessing is an effective and efficient approach in handling lighting variations for face recognition. Despite much attention to face illumination preprocessing, there is seldom systemic comparative study on existing approaches that presents fascinating insights and conclusions in how to design better illumination preprocessing met...
Article
Lighting normalization is a kind of widely used approach for achieving illumination invariant face recognition. Lighting normalization approaches try to regularize various lighting conditions in different face images into ideal illumination before face recognition. However, many existing methods perform lighting normalization by treating face image...
Article
Full-text available
The problem of automatically matching composite sketches to facial photographs is addressed in this paper. Previous research on sketch recognition focused on matching sketches drawn by professional artists who either looked directly at the subjects (viewed sketches) or used a verbal description of the subject's appearance as provided by an eyewitne...
Technical Report
When a crime occurs and a facial photograph of the suspect is not available (from a surveillance camera or a mobile phone), law enforcement agencies often use a facial sketch to help identify and capture the suspect. Typically, the facial sketch of a suspect is released to the public via newspapers and television so that citizens can identify the...
Conference Paper
Full-text available
In the last decade, some illumination preprocessing approaches were proposed to eliminate the lighting variation in face images for lighting-invariant face recognition. However, we find surprisingly that existing preprocessing methods were seldom modeled to directly enhance the separability of different faces, which should have been the essential g...
Conference Paper
Full-text available
There is growing interest in achieving age invariant face recognition due to its wide applications in law enforcement. The challenge lies in that face aging is quite a complicated process, which involves both intrinsic and extrinsic factors. Face aging also influences individual facial components (such as the mouth, eyes, and nose) differently. We...
Conference Paper
3D face modeling from 2D face images is of significant importance for face analysis, animation and recognition. Previous research on this topic mainly focused on 3D face modeling from a single 2D face image; however, a single face image can only provide a limited description of a 3D face. In many applications, for example, law enforcement, multi-vi...
Conference Paper
Full-text available
Illumination variation is one of intractable yet crucial problems in face recognition and many lighting normalization approaches have been proposed in the past decades. Nevertheless, most of them preprocess all the face images in the same way thus without considering the specific lighting in each face image. In this paper, we propose a lighting awa...
Conference Paper
Full-text available
Illumination variation has been one of the most intractable problems in face recognition and many approaches have been proposed to handle illumination problem in the last decades of years. The key problem is how to get stable similarity measurements between two face images of the same individual but captured under dramatically different lighting co...
Conference Paper
Full-text available
Today's camera sensors usually have a high gray-scale resolution, e.g. 256, however, due to the dramatic lighting variations, the gray-scales distributed to the face region might be far less than 256. Therefore, besides low spatial resolution, a practical face recognition system must also handle degraded face images of low gray-scale resolution (LG...
Conference Paper
Full-text available
In this paper, we propose a novel homomorphic wavelet filtering based illumination transfer technique to change the dominant lighting of one face image (source face image) to another face image (reference face image). Specifically, in the proposed method, based on the "reflectance-illumination" imaging model, we first obtain an approximate...
