About
51
Publications
54,950
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
8,270
Citations
Introduction
I am a PhD candidate in National Laboratory of Pattern Recognition in the Institute of Automation, Chinese Academic of Sciences. My supervisor is Prof. Stan Z. Li. Before that, I received the B. Eng degree from School of Communication and Information Engineering in University of Electronic Science and Technology of China in 2015.
My research interest includes machine learning and pattern recognition, with a focus on object detection, face detection, pedestrian detection and video detection.
Skills and Expertise
Current institution
Publications
Publications (51)
In face recognition, designing margin-based (e.g., angular, additive, additive angular margins) softmax loss functions plays an important role in learning discriminative features. However, these hand-crafted heuristic methods are sub-optimal because they require much effort to explore the large design space. Recently, an AutoML for loss function se...
Face detection has achieved significant progress in recent years. However, high performance face detection still remains a very challenging problem, especially when there exists many tiny faces. In this paper, we present a single-shot refinement face detector namely RefineFace to achieve high performance. Specifically, it consists of five modules:...
Convolutional neural network based methods have dominated object detection in recent years, which can be divided into the one-stage approach and the two-stage approach. In general, the two-stage approach ( e.g., Faster R-CNN) achieves high accuracy, while the one-stage approach ( e.g., SSD) has the advantage of high efficiency. To inherit the merit...
Face recognition has witnessed significant progress due to the advances of deep convolutional neural networks (CNNs), the central task of which is how to improve the feature discrimination. To this end, several margin-based (e.g., angular, additive and additive angular margins) softmax loss functions have been proposed to increase the feature margi...
Pedestrian detection in crowded scenes is a challenging problem, because occlusion happens frequently among different pedestrians. In this paper, we propose an effective and efficient detection network to hunt pedestrians in crowd scenes. The proposed method, namely PedHunter, introduces strong occlusion handling ability to existing region-based de...
Head and human detection have been rapidly improved with the development of deep convolutional neural networks. However, these two tasks are often studied separately without considering their inherent correlation, leading to that 1) head detection is often trapped in more false positives, and 2) the performance of human detector frequently drops dr...
Existing enhancement methods are empirically expected to help the high-level end computer vision task: however, that is observed to not always be the case in practice. We focus on object or face detection in poor visibility enhancements caused by bad weathers (haze, rain) and low light conditions. To provide a more thorough examination and fair com...
Object detection has been dominated by anchor-based detectors for several years. Recently, anchor-free detectors have become popular due to the proposal of FPN and Focal Loss. In this paper, we first point out that the essential difference between anchor-based and anchor-free detection is actually how to define positive and negative training sample...
Face recognition has witnessed significant progress due to the advances of deep convolutional neural networks (CNNs), the central task of which is how to improve the feature discrimination. To this end, several margin-based (\textit{e.g.}, angular, additive and additive angular margins) softmax loss functions have been proposed to increase the feat...
Human and head detection have been rapidly improved with the development of deep convolutional neural networks. However, these two detection tasks are often studied separately, without taking advantage of the relationship between human and head. In this paper, we present a new two-stage detection framework, namely Joint Enhancement Detection (JED),...
Pedestrian detection has achieved significant progress with the availability of existing benchmark datasets. However, there is a gap in the diversity and density between real world requirements and current pedestrian detection benchmarks: 1) most of existing datasets are taken from a vehicle driving through the regular traffic scenario, usually lea...
Head and human detection have been rapidly improved with the development of deep convolutional neural networks. However, these two tasks are often studied separately without considering their inherent correlation, leading to that 1) head detection is often trapped in more false positives, and 2) the performance of human detector frequently drops dr...
Pedestrian detection in crowded scenes is a challenging problem, because occlusion happens frequently among different pedestrians. In this paper, we propose an effective and efficient detection network to hunt pedestrians in crowd scenes. The proposed method, namely PedHunter, introduces strong occlusion handling ability to existing region-based de...
Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects ($\le\negmedspace170$) and modalities ($\leq\negmedspace2$), w...
High performance face detection remains a very challenging problem, especially when there exists many tiny faces. This paper presents a novel single-shot face detector, named Selective Refinement Network (SRN), which introduces novel two-step classification and regression operations selectively into an anchor-based face detector to reduce false pos...
Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects (≤ 170) and modalities (≤ 2), which hinder the further develop...
Current state-of-the-art object objectors are fine-tuned from the off-the-shelf networks pretrained on large-scale classification dataset ImageNet, which incurs some additional problems: 1) The classification and detection have different degrees of sensitivity to translation, resulting in the learning objective bias; 2) The architecture is limited...
High performance face detection remains a very challenging problem, especially when there exists many tiny faces. This paper presents a novel single-shot face detector, named Selective Refinement Network (SRN), which introduces novel twostep classification and regression operations selectively into an anchor-based face detector to reduce false posi...
Pedestrian detection has achieved significant progress with the availability of exiting benchmark datasets. However, there is a gap in the diversity and density between real world requirements and current pedestrian detection benchmarks: 1) most of existing datasets are taken from a vehicle driving through the regular traffic scenario, usually lead...
Although tremendous strides have been made in face detection, one of the remaining open issues is to achieve CPU real-time speed as well as maintain high performance, since effective models for face detection tend to be computationally prohibitive. To address this issue, we propose a novel face detector, named FaceBoxes, with superior performance o...
In this work, we describe a single-shot scale-aware convolutional neural network based face detector (SFDet). In comparison with the state-of-the-art anchor-based face detection methods, the main advantages of our method are summarized in four aspects. (1) We propose a scale-aware detection network using a wide scale range of layers associated with...
This paper presents a review of the 2018 WIDER Challenge on Face and Pedestrian. The challenge focuses on the problem of precise localization of human faces and bodies, and accurate association of identities. It comprises of three tracks: (i) WIDER Face which aims at soliciting new approaches to advance the state-of-the-art in face detection, (ii)...
As a long-standing problem in computer vision, face detection has attracted much attention in recent decades for its practical applications. With the availability of face detection benchmark WIDER FACE dataset, much of the progresses have been made by various algorithms in recent years. Among them, the Selective Refinement Network (SRN) face detect...
Face recognition has witnessed significant progresses due to the advances of deep convolutional neural networks (CNNs), the central challenge of which, is feature discrimination. To address it, one group tries to exploit mining-based strategies (\textit{e.g.}, hard example mining and focal loss) to focus on the informative examples. The other group...
Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects ($\le\negmedspace170$) and modalities ($\leq\negmedspace2$), w...
Current state-of-the-art object objectors are fine-tuned from the off-the-shelf networks pretrained on large-scale classification datasets like ImageNet, which incurs some accessory problems: 1) the domain gap between source and target datasets; 2) the learning objective bias between classification and detection; 3) the architecture limitations of...
Current state-of-the-art object objectors are fine-tuned from the off-the-shelf networks pretrained on large-scale classification datasets like ImageNet, which incurs some accessory problems: 1) the domain gap between source and target datasets; 2) the learning objective bias between classification and detection; 3) the architecture limitations of...
High performance face detection remains a very challenging problem, especially when there exists many tiny faces. This paper presents a novel single-shot face detector, named Selective Refinement Network (SRN), which introduces novel two-step classification and regression operations selectively into an anchor-based face detector to reduce false pos...
Pedestrian detection in crowded scenes is a challenging problem since the pedestrians often gather together and occlude each other. In this paper, we propose a new occlusion-aware R-CNN (OR-CNN) to improve the detection accuracy in the crowd. Specifically, we design a new aggregation loss to enforce proposals to be close and locate compactly to the...
High performance face detection remains a very challenging problem, especially when there exists many tiny faces. This paper presents a novel single-shot face detector, named Selective Refinement Network (SRN), which introduces novel two-step classification and regression operations selectively into an anchor-based face detector to reduce false pos...
Multispectral pedestrian detection is an emerging solution with great promise in many around-the-clock applications, such as automotive driving and security surveillance. To exploit the complementary nature and remedy contradictory appearance between modalities, in this paper, we propose a novel cross-modality interactive attention network that tak...
Although face detection has taken a big step forward with the development of anchor based face detector, the issue of effective detection of faces with different scales still remains. To solve this problem, we present an one-stage face detector, named Single Shot Attention-Based Face Detector (AFD), which enables accurate detection of multi-scale f...
Pedestrian detection in crowded scenes is a challenging problem since the pedestrians often gather together and occlude each other. In this paper, we propose a new occlusion-aware R-CNN (OR-CNN) to improve the detection accuracy in the crowd. Specifically, we design a new aggregation loss to enforce proposals to be close and locate compactly to the...
Pedestrian detection in crowded scenes is a challenging problem since the pedestrians often gather together and occlude each other. In this paper, we propose a new occlusion-aware R-CNN (OR-CNN) to improve the detection accuracy in the crowd. Specifically, we design a new aggregation loss to enforce proposals to be close and locate compactly to the...
Softmax loss is arguably one of the most popular losses to train CNN models for image classification. However, recent works have exposed its limitation on feature discriminability. This paper casts a new viewpoint on the weakness of softmax loss. On the one hand, the CNN features learned using the softmax loss are often inadequately discriminative....
For object detection, the two-stage approach (e.g., Faster R-CNN) has been achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has the advantage of high efficiency. To inherit the merits of both while overcoming their disadvantages, in this paper, we propose a novel single-shot based detector, called RefineDet, that achieves...
Although face detection has taken a big step forward with the development of anchor based face detector, the issue of effective detection of faces with different scales still remains. To solve this problem, we present an one-stage face detector, named Single Shot Attention-Based Face Detector (AFD), which enables accurate detection of multi-scale f...
Softmax loss is arguably one of the most popular losses to train CNN models for image classification. However, recent works have exposed its limitation on feature discriminability. This paper casts a new viewpoint on the weakness of softmax loss. On the one hand, the CNN features learned using the softmax loss are often inadequately discriminative....
Accuracy and efficiency are two conflicting challenges for face detection, since effective models tend to be computationally prohibitive. To address these two conflicting challenges, our core idea is to shrink the input image and focus on detecting small faces. Reducing the image resolution can significantly improve the detection speed, but it also...
For object detection, the two-stage approach (e.g., Faster R-CNN) has been achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has the advantage of high efficiency. To inherit the merits of both while overcoming their disadvantages, in this paper, we propose a novel single-shot based detector, called RefineDet, that achieves...
Accuracy and efficiency are two conflicting challenges for face detection, since effective models tend to be computationally prohibitive. To address these two conflicting challenges, our core idea is to shrink the input image and focus on detecting small faces. Specifically, we propose a novel face detector, dubbed the name Densely Connected Face P...
This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S³FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces. Specifically, we try to solve the common problem that anchorbased detectors deteriorate dramatically as the objects become smaller....
Although tremendous strides have been made in face detection, one of the remaining open challenges is to achieve real-time speed on the CPU as well as maintain high performance, since effective models for face detection tend to be computationally prohibitive. To address this challenge, we propose a novel face detector, named FaceBoxes, with superio...
This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S$^3$FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces. Specifically, we try to solve the common problem that anchor-based detectors deteriorate dramatically as the objects become smal...
Although tremendous strides have been made in face detection, one of the remaining open challenges is to achieve real-time speed on the CPU as well as maintain high performance, since effective models for face detection tend to be computationally prohibitive. To address this challenge, we propose a novel face detector, named FaceBoxes, with superio...
A variety of encoding methods for bag of word (BoW) model have been proposed to encode the local features in image classification. However, most of them are unsupervised and just employ k-means to form the visual vocabulary, thus reducing the discriminative power of the features. In this paper, we propose a metric embedded discriminative vocabulary...
A variety of encoding methods for bag of word (BoW) model have been proposed to encode the local features in image classification. However, most of them are unsupervised and just employ k-means to form the visual vocabulary, thus reducing the discriminative power of the features. In this paper, we propose a metric embedded discriminative vocabulary...