Conference Paper

A face recognition algorithm based on thermal and visible data

Conference Paper
Full-text available
The latest multi-biometric grand challenge (MBGC 2008) sets up a new experiment in which near infrared (NIR) face videos containing partial faces are used as a probe set and the visual (VIS) images of full faces are used as the target set. This is challenging for two reasons: (1) it has to deal with partially occluded faces in the NIR videos, and (2) the matching is between heterogeneous NIR and VIS faces. Partial face matching is also a problem often confronted in many video-based face biometric applications. In this paper, we propose a novel approach for solving this challenging problem. For partial face matching, we propose a local patch based method to deal with partial face data. For heterogeneous face matching, we propose the philosophy of enhancing common features in heterogeneous images while reducing differences. This is realized by using edge-enhancing filters, which are at the same time also beneficial for partial face matching. The approach requires neither learning procedures nor training data. Experiments are performed using the MBGC portal challenge data, comparing with several known state-of-the-art methods. Extensive results show that the proposed approach, without knowing statistical characteristics of the subjects or data, significantly outperforms the comparison methods, with ten-fold higher verification rates at a FAR of 0.1%.
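The pipeline this abstract describes — edge-enhance both modalities, then compare local patches and skip occluded regions — can be sketched as follows. The Sobel gradient magnitude and normalized-correlation score are illustrative stand-ins, since the abstract does not specify the exact edge-enhancing filter or patch metric:

```python
import numpy as np

def edge_enhance(img):
    """Stand-in edge enhancement: Sobel gradient magnitude."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = pad[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    return np.hypot(gx, gy)

def patch_similarity(a, b):
    """Normalized correlation between two (edge-enhanced) patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0

def match_partial(probe, target, patch=8):
    """Average the per-patch correlations; constant (treated as
    occluded/empty) probe patches are simply skipped."""
    pe, te = edge_enhance(probe), edge_enhance(target)
    scores = []
    for i in range(0, probe.shape[0] - patch + 1, patch):
        for j in range(0, probe.shape[1] - patch + 1, patch):
            if probe[i:i + patch, j:j + patch].std() == 0:
                continue  # skip occluded region
            scores.append(patch_similarity(pe[i:i + patch, j:j + patch],
                                           te[i:i + patch, j:j + patch]))
    return float(np.mean(scores)) if scores else 0.0
```

A genuine face matched against itself scores near 1.0, while an unrelated image scores lower, which is the behaviour a verification threshold exploits.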
Conference Paper
Full-text available
This paper presents an efficient algorithm for matching sketches with digital face images. The algorithm extracts discriminating information present in local facial regions at different levels of granularity. Both sketches and digital images are decomposed into a multi-resolution pyramid to conserve the high-frequency information that forms the discriminating facial patterns. Extended uniform circular local binary pattern based descriptors use these patterns to form a unique signature of the face image. Further, for matching, a genetic optimization based approach is proposed to find the optimum weights corresponding to each facial region. The information obtained from different levels of the Laplacian pyramid is combined to improve the identification accuracy. Experimental results on sketch-digital image pairs from the CUHK and IIIT-D databases show that the proposed algorithm provides better identification performance than existing algorithms.
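The multi-resolution decomposition mentioned above can be illustrated with a minimal Laplacian pyramid. The box-filter downsampling and nearest-neighbour upsampling here are simplifications of the usual Gaussian kernels, chosen so the construction stays exactly invertible:

```python
import numpy as np

def downsample(img):
    """2x2 box average followed by decimation."""
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img, shape):
    """Nearest-neighbour upsampling back to `shape`."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    """Each level keeps the high-frequency detail lost by downsampling;
    the last level is the coarse residual."""
    g = img.astype(float)
    pyr = []
    for _ in range(levels - 1):
        nxt = downsample(g)
        pyr.append(g - upsample(nxt, g.shape))
        g = nxt
    pyr.append(g)
    return pyr
```

Summing each detail level back onto the upsampled coarser level reconstructs the original image exactly, which is why the pyramid "conserves" the high-frequency information the descriptors rely on.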
Conference Paper
Full-text available
Automatic face photo-sketch recognition has important applications for law enforcement. Recent research has focused on transforming photos and sketches into the same modality for matching or developing advanced classification algorithms to reduce the modality gap between features extracted from photos and sketches. In this paper, we propose a new inter-modality face recognition approach by reducing the modality gap at the feature extraction stage. A new face descriptor based on coupled information-theoretic encoding is used to capture discriminative local face structures and to effectively match photos and sketches. Guided by maximizing the mutual information between photos and sketches in the quantized feature spaces, the coupled encoding is achieved by the proposed coupled information-theoretic projection tree, which is extended to a randomized forest to further boost the performance. We create the largest face sketch database to date, including sketches of 1,194 people from the FERET database. Experiments on this large-scale dataset show that our approach significantly outperforms the state-of-the-art methods.
Conference Paper
Full-text available
Face recognition algorithms need to deal with variable lighting conditions. Near infrared (NIR) image based face recognition technology has been proposed to effectively overcome this difficulty. However, it requires that the enrolled face images be captured as NIR images, whereas many applications require visual (VIS) images for enrollment templates. To take advantage of NIR face images for illumination-invariant face recognition and allow the use of VIS face images for enrollment, we encounter a new face image pattern recognition problem, that is, heterogeneous face matching between NIR and VIS faces. In this paper, we present a subspace learning framework named Coupled Spectral Regression (CSR) to solve this challenging problem of coupling the two types of face images and matching between them. CSR first models the properties of different types of data separately and then learns two associated projections to project heterogeneous data (e.g. VIS and NIR) respectively into a discriminative common subspace in which classification is finally performed. Compared to other existing methods, CSR is computationally efficient, benefiting from the efficiency of spectral regression, and has better generalization performance. Experimental results on a VIS-NIR face database show that the proposed CSR method significantly outperforms the existing methods.
Conference Paper
Full-text available
Automatic face sketch synthesis has important applications in law enforcement and digital entertainment. Although great progress has been made in recent years, previous methods only work under well controlled conditions and often fail when there are variations of lighting and pose. In this paper, we propose a robust algorithm for synthesizing a face sketch from a face photo taken under a different lighting condition and in a different pose than the training set. It synthesizes local sketch patches using a multiscale Markov Random Field (MRF) model. The robustness to lighting and pose variations is achieved in three steps. Firstly, shape priors specific to facial components are introduced to reduce artifacts and distortions caused by variations of lighting and pose. Secondly, new patch descriptors and metrics which are more robust to lighting variations are used to find candidates of sketch patches given a photo patch. Lastly, a smoothing term measuring both intensity compatibility and gradient compatibility is used to match neighboring sketch patches on the MRF network more effectively. The proposed approach significantly improves the performance of the state-of-the-art method. Its effectiveness is shown through experiments on the CUHK face sketch database and celebrity photos collected from the web.
Conference Paper
Full-text available
This paper addresses the face hallucination problem of converting thermal infrared face images into photo-realistic ones. It is a challenging task because the two modalities are dramatically different, which makes many developed linear models inapplicable. We propose a learning-based framework synthesizing the normal face from the infrared input. Compared to previous work, we further exploit the local linearity not only in the image spatial domain but also on the image manifolds. We have also developed a measurement of the variance between an input and its prediction, so that we can apply the Markov random field model to the predicted normal face to improve the hallucination result. Experimental results show the advantage of our algorithm over existing methods. Our algorithm can be readily generalized to solve other multi-modal image conversion problems as well.
Conference Paper
Full-text available
Matching near-infrared (NIR) face images to visible light (VIS) face images offers a robust approach to face recognition with unconstrained illumination. In this paper we propose a novel method of heterogeneous face recognition that uses a common feature-based representation for both NIR images as well as VIS images. Linear discriminant analysis is performed on a collection of random subspaces to learn discriminative projections. NIR and VIS images are matched (i) directly using the random subspace projections, and (ii) using sparse representation classification. Experimental results demonstrate the effectiveness of the proposed approach for matching NIR and VIS face images.
Article
Full-text available
The problem of matching a forensic sketch to a gallery of mug shot images is addressed in this paper. Previous research in sketch matching only offered solutions to matching highly accurate sketches that were drawn while looking at the subject (viewed sketches). Forensic sketches differ from viewed sketches in that they are drawn by a police sketch artist using the description of the subject provided by an eyewitness. To identify forensic sketches, we present a framework called local feature-based discriminant analysis (LFDA). In LFDA, we individually represent both sketches and photos using SIFT feature descriptors and multiscale local binary patterns (MLBP). Multiple discriminant projections are then used on partitioned vectors of the feature-based representation for minimum distance matching. We apply this method to match a data set of 159 forensic sketches against a mug shot gallery containing 10,159 images. Compared to a leading commercial face recognition system, LFDA offers substantial improvements in matching forensic sketches to the corresponding face images. We were able to further improve the matching performance using race and gender information to reduce the target gallery size. Additional experiments demonstrate that the proposed framework leads to state-of-the-art accuracy when matching viewed sketches.
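A minimal sketch of the MLBP half of the feature representation described above, assuming integer-radius 8-neighbour sampling and a simple 2x2 grid partition (the radii and partitioning here are illustrative, not the paper's exact settings):

```python
import numpy as np

# 8 neighbour directions, clockwise from top-left
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(img, radius=1):
    """8-neighbour LBP at an integer radius (a simplification of the
    circular sub-pixel sampling used in the literature)."""
    img = img.astype(float)
    h, w = img.shape
    r = radius
    center = img[r:h - r, r:w - r]
    codes = np.zeros(center.shape, dtype=int)
    for bit, (dy, dx) in enumerate(OFFSETS):
        neigh = img[r + dy * r:h - r + dy * r, r + dx * r:w - r + dx * r]
        codes |= (neigh >= center).astype(int) << bit
    return codes

def mlbp_histogram(img, radii=(1, 2), grid=2):
    """Concatenate per-region, per-radius LBP histograms into one
    feature vector, mirroring the 'multiscale + partitioned' idea."""
    feats = []
    for r in radii:
        codes = lbp_codes(img, r)
        h, w = codes.shape
        for i in range(grid):
            for j in range(grid):
                region = codes[i * h // grid:(i + 1) * h // grid,
                               j * w // grid:(j + 1) * w // grid]
                hist, _ = np.histogram(region, bins=256, range=(0, 256))
                feats.append(hist / max(region.size, 1))
    return np.concatenate(feats)
```

Each region contributes a normalized 256-bin histogram, so the concatenated vector is the input the discriminant projections would then operate on.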
Article
Full-text available
In this paper, we propose a novel face photo-sketch synthesis and recognition method using a multiscale Markov Random Fields (MRF) model. Our system has three components: 1) given a face photo, synthesizing a sketch drawing; 2) given a face sketch drawing, synthesizing a photo; and 3) searching for face photos in the database based on a query sketch drawn by an artist. It has useful applications for both digital entertainment and law enforcement. We assume that faces to be studied are in a frontal pose, with normal lighting and neutral expression, and have no occlusions. To synthesize sketch/photo images, the face region is divided into overlapping patches for learning. The size of the patches decides the scale of local face structures to be learned. From a training set which contains photo-sketch pairs, the joint photo-sketch model is learned at multiple scales using a multiscale MRF model. By transforming a face photo to a sketch (or transforming a sketch to a photo), the difference between photos and sketches is significantly reduced, thus allowing effective matching between the two in face sketch recognition. After the photo-sketch transformation, in principle, most existing face photo recognition approaches can be applied to face sketch recognition in a straightforward way. Extensive experiments are conducted on a face sketch database including 606 faces, which can be downloaded from our Web site (http://mmlab.ie.cuhk.edu.hk/facesketch.html).
Conference Paper
Full-text available
Most face recognition systems focus on photo-based face recognition. In this paper, we present a face recognition system based on face sketches. The proposed system contains two elements: pseudo-sketch synthesis and sketch recognition. The pseudo-sketch generation method is based on locally preserving the linear geometry between photo and sketch images, which is inspired by the idea of locally linear embedding. Nonlinear discriminant analysis is used to recognize the probe sketch from the synthesized pseudo-sketches. Experimental results on over 600 photo-sketch pairs show that the performance of the proposed method is encouraging.
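The locally-linear-embedding idea behind the pseudo-sketch synthesis can be sketched as follows: solve for affine reconstruction weights of a photo patch over its neighbour patches, then transfer those weights to the paired sketch patches. The regularization constant is an assumption for numerical stability, not a value taken from the paper:

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Reconstruction weights expressing patch `x` as an affine
    (sum-to-one) combination of its neighbour patches (rows)."""
    Z = neighbors - x                     # shift neighbours to the query
    G = Z @ Z.T                           # local Gram matrix
    G = G + reg * np.trace(G) * np.eye(len(neighbors))  # regularize
    w = np.linalg.solve(G, np.ones(len(neighbors)))
    return w / w.sum()                    # enforce the affine constraint

def synthesize(photo_patch, photo_neighbors, sketch_neighbors):
    """Transfer photo-space weights to the paired sketch patches."""
    w = lle_weights(photo_patch, photo_neighbors)
    return w @ sketch_neighbors
```

The key assumption is that photo patches and their sketch counterparts share the same local linear geometry, so weights fitted in one space remain valid in the other.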
Conference Paper
Recently, various algorithms for building three-dimensional maps of indoor environments have been proposed. In this work we use a Kinect camera that captures RGB images along with depth information to build three-dimensional dense maps of indoor environments. Mapping systems commonly consist of three components: first, spatial alignment of consecutive data frames; second, detection of loop closures; and finally, globally consistent alignment of the data sequence. Three-dimensional point clouds are well suited for frame-to-frame alignment and for three-dimensional dense reconstruction even without the valuable visual RGB information. We propose a new fusion algorithm that combines visual features and depth information for loop-closure detection, followed by pose optimization to build globally consistent maps. The performance of the proposed system in real indoor environments is presented and discussed.
Article
We study the problem of creating a complete model of a physical object. Although this may be possible using intensity images, here we use images which directly provide access to three-dimensional information. The first problem that we need to solve is to find the transformation between the different views. Previous approaches either assume this transformation to be known (which is extremely difficult for a complete model), or compute it with feature matching (which is not accurate enough for integration). In this paper, we propose a new approach which works on range data directly and registers successive views with enough overlapping area to get an accurate transformation between views. This is performed by minimizing a functional which does not require point-to-point matches. We give the details of the registration method and modelling procedure, and illustrate them on real range images of complex objects.
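For contrast with the correspondence-free functional proposed above, the classic alternative it improves on — closed-form rigid alignment from known point-to-point matches (the Kabsch/SVD solution) — looks like this:

```python
import numpy as np

def rigid_align(P, Q):
    """Closed-form least-squares rigid transform (R, t) taking point
    set P onto Q, assuming known row-wise correspondences. The paper's
    contribution is precisely to avoid needing such correspondences;
    this is the baseline formulation."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)             # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    D = np.diag([1.0] * (P.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

Given exact correspondences this recovers the transform perfectly; the difficulty the paper addresses is that real overlapping range scans provide no such per-point matches.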
Article
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
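The fast nearest-neighbor matching step described above can be sketched with a distance-ratio test: a match is accepted only when the nearest descriptor is clearly closer than the second nearest. The 0.8 threshold and the brute-force search are illustrative stand-ins for the paper's tuned threshold and approximate nearest-neighbor index:

```python
import numpy as np

def ratio_match(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in
    desc_b, keeping only unambiguous matches (ratio test)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        # accept only if the best match is clearly better than the runner-up
        if len(order) > 1 and dists[order[0]] < ratio * dists[order[1]]:
            matches.append((i, int(order[0])))
    return matches
```

Discarding ambiguous matches this way is what keeps the downstream Hough clustering and least-squares pose verification from being swamped by false correspondences.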
Book
This book contains the papers presented at ICB 2009, the 3rd IAPR/IEEE International Conference on Biometrics, held 2–5 June in Alghero, Italy, and hosted by the Computer Vision Laboratory, University of Sassari.
Conference Paper
Stable local feature detection and representation is a fundamental component of many image registration and object recognition algorithms. Mikolajczyk and Schmid (June 2003) recently evaluated a variety of approaches and identified the SIFT [D. G. Lowe, 1999] algorithm as being the most resistant to common image deformations. This paper examines (and improves upon) the local image descriptor used by SIFT. Like SIFT, our descriptors encode the salient aspects of the image gradient in the feature point's neighborhood; however, instead of using SIFT's smoothed weighted histograms, we apply principal components analysis (PCA) to the normalized gradient patch. Our experiments demonstrate that the PCA-based local descriptors are more distinctive, more robust to image deformations, and more compact than the standard SIFT representation. We also present results showing that using these descriptors in an image retrieval application results in increased accuracy and faster matching.
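The core of the idea described above — fit PCA to flattened, mean-centered (gradient) patches, then describe each new patch by its top-k projection — can be sketched as follows. The patch size and number of components here are illustrative, not the paper's settings:

```python
import numpy as np

def pca_project(patches, k=8):
    """Fit PCA on flattened patches; return the top-k projection
    matrix (rows are principal directions) and the training mean."""
    X = patches.reshape(len(patches), -1).astype(float)
    mean = X.mean(axis=0)
    # principal directions via SVD of the centered data matrix
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return Vt[:k], mean

def describe(patch, basis, mean):
    """Compact descriptor: project the centered patch onto the basis."""
    return basis @ (patch.ravel() - mean)
```

The resulting descriptor is much shorter than a raw patch while the orthonormal basis preserves the dominant gradient structure, which is the source of the compactness and distinctiveness claims.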
Article
Sketch synthesis plays an important role in face sketch-photo recognition systems. In this manuscript, an automatic sketch synthesis algorithm is proposed based on embedded hidden Markov model (E-HMM) and selective ensemble strategy. First, the E-HMM is adopted to model the nonlinear relationship between a sketch and its corresponding photo. Then, based on several learned models, a series of pseudo-sketches are generated for a given photo. Finally, these pseudo-sketches are fused together with a selective ensemble strategy to synthesize a finer face pseudo-sketch. Experimental results illustrate that the proposed algorithm achieves satisfactory sketch synthesis results with a small set of face training samples.
Article
This paper presents a theoretically very simple, yet efficient, multiresolution approach to gray-scale and rotation invariant texture classification based on local binary patterns and nonparametric discrimination of sample and prototype distributions. The method is based on recognizing that certain local binary patterns, termed "uniform," are fundamental properties of local image texture, and their occurrence histogram is proven to be a very powerful texture feature. We derive a generalized gray-scale and rotation invariant operator presentation that allows for detecting the "uniform" patterns for any quantization of the angular space and for any spatial resolution, and present a method for combining multiple operators for multiresolution analysis. The proposed approach is very robust in terms of gray-scale variations since the operator is, by definition, invariant against any monotonic transformation of the gray scale. Another advantage is computational simplicity, as the operator can be realized with a few operations in a small neighborhood and a lookup table. Experimental results demonstrate that good discrimination can be achieved with the occurrence statistics of simple rotation invariant local binary patterns.
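The "uniform" mapping described above reduces each circular neighbourhood bit string to a rotation-invariant code: the number of ones when the string has at most two 0/1 transitions, and a single "non-uniform" label (P+1) otherwise. A minimal sketch for an 8-neighbour string:

```python
def riu2(bits):
    """Rotation-invariant 'uniform' LBP code for a circular bit list:
    the count of ones if there are at most two 0/1 transitions around
    the circle, otherwise the catch-all label P + 1."""
    p = len(bits)
    transitions = sum(bits[i] != bits[(i + 1) % p] for i in range(p))
    return int(sum(bits)) if transitions <= 2 else p + 1
```

For P = 8 this yields only 10 distinct codes (0 through 8, plus the non-uniform label 9), which is why the operator can be realized with a small lookup table.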
Article
Proc. of the International Conference on Computer Vision, Corfu (Sept. 1999). An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest-neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low-residual least-squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially-occluded images with a computation time of under 2 seconds.