-
C McCool,
S Marcel,
H Abdenour,
M Pietikainen,
P Matejka,
J Cernocky,
N Poh, J Kittler,
A Larcher,
C Levy,
D Matrouf,
J-F Bonastre,
P Tresadern,
T Cootes
IEEE ICME Workshop on Hot Topics in Mobile Multimedia; 07/2012
-
ABSTRACT: This paper proposes a unified framework for quality-based fusion of multimodal biometrics. Quality-dependent fusion algorithms aim to dynamically combine several classifier (biometric expert) outputs as a function of automatically derived (biometric) sample quality. Quality measures used for this purpose quantify the degree of conformance of biometric samples to some predefined criteria known to influence the system performance. Designing a fusion classifier to take quality into consideration is difficult because quality measures cannot be used to distinguish genuine users from impostors, i.e., they are nondiscriminative yet still useful for classification. We propose a general Bayesian framework that can utilize the quality information effectively. We show that this framework encompasses several recently proposed quality-based fusion algorithms in the literature (Nandakumar et al., 2006; Poh et al., 2007; Kryszczuk and Drygajlo, 2007; Kittler et al., 2007; Alonso-Fernandez, 2008; Maurer and Baker, 2007; Poh et al., 2010). Furthermore, thanks to the systematic study concluded herein, we also develop two alternative formulations of the problem, leading to more efficient implementation (with fewer parameters) and achieving performance comparable to, or better than, the state of the art. Last but not least, the framework also improves the understanding of the role of quality in multiple classifier combination.
IEEE Transactions on Pattern Analysis and Machine Intelligence 12/2011; 34:3-18. · 4.91 Impact Factor
-
ABSTRACT: The lip-region can be interpreted as either a genetic or behavioural biometric trait depending on whether static or dynamic information is used. In this paper, we use a texture descriptor called Local Ordinal Contrast Pattern (LOCP) in conjunction with a novel spatiotemporal sampling method called Windowed Three Orthogonal Planes (WTOP) to represent both appearance and dynamics features observed in visual speech. This representation, with standard speaker verification engines, is shown to improve the performance of the lip biometric trait compared to the state-of-the-art. The improvement obtained suggests that there is enough discriminative information in the mouth-region to enable its use as a primary biometric as opposed to a "soft" biometric trait.
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on; 06/2011 · 4.63 Impact Factor
-
ABSTRACT: Bags-of-visual-Words (BoW) and Spatio-Temporal Shapes (STS) are two very popular approaches for action recognition from video. The former (BoW) is an un-structured global representation of videos which is built using a large set of local features. The latter (STS) uses a single feature located on a region of interest (where the actor is) in the video. Despite the popularity of these methods, no comparison between them has been done. Also, given that BoW and STS differ intrinsically in terms of context inclusion and globality/locality of operation, an appropriate evaluation framework has to be designed carefully. This paper compares these two approaches using four different datasets with varied degree of space-time specificity of the actions and varied relevance of the contextual background. We use the same local feature extraction method and the same classifier for both approaches. Further to BoW and STS, we also evaluated novel variations of BoW constrained in time or space. We observe that the STS approach leads to better results in all datasets whose background is of little relevance to action classification.
Applications of Computer Vision (WACV), 2011 IEEE Workshop on; 02/2011
-
N. Poh,
Chi Ho Chan, J. Kittler,
S. Marcel,
C. McCool,
E.A. Rúa,
J.L.A. Castro,
M. Villegas,
R. Paredes,
V. Štruc,
N. Pavešić,
A.A. Salah,
Hui Fang,
N. Costen
ABSTRACT: Person recognition using facial features, e.g., mug-shot images, has long been used in identity documents. However, due to the widespread use of web-cams and mobile devices embedded with a camera, it is now possible to realize facial video recognition, rather than resorting to just still images. In fact, facial video recognition offers many advantages over still image recognition; these include the potential of boosting the system accuracy and deterring spoof attacks. This paper presents an evaluation of person identity verification using facial video data, organized in conjunction with the International Conference on Biometrics (ICB 2009). It involves 18 systems submitted by seven academic institutes. These systems provide for a diverse set of assumptions, including feature representation and preprocessing variations, allowing us to assess the effect of adverse conditions, usage of quality information, query selection, and template construction for video-to-video face authentication.
IEEE Transactions on Information Forensics and Security 01/2011; · 1.34 Impact Factor
-
ABSTRACT: Visual concept detection is one of the most important tasks in image and video indexing. This paper describes our system in the ImageCLEF@ICPR Visual Concept Detection Task which ranked first for large-scale visual concept detection tasks in terms of Equal Error Rate (EER) and Area under Curve (AUC) and ranked third in terms of hierarchical measure. The presented approach involves state-of-the-art local descriptor computation, vector quantisation via clustering, structured scene or object representation via localised histograms of vector codes, similarity measure for kernel construction and classifier learning. The main novelty is the classifier-level and kernel-level fusion using Kernel Discriminant Analysis with RBF/Power Chi-Squared kernels obtained from various image descriptors. For 32 out of 53 individual concepts, we obtain the best performance of all 12 submissions to this task.
12/2010: pages 162-170;
-
ABSTRACT: A key prerequisite of automatic video indexing and summarisation is the description of events and actions. In the context of many sports, the motion of the ball and agents plays an essential role in describing events. However, the only existing solution for the tennis event recognition problem in the literature relies on a set of heuristic rules, such as proximity between ball and players or court lines, to classify ball event candidates. We present a hidden Markov model (HMM) paradigm that automatically learns to identify events from ball trajectories, and demonstrate that its ability to capture the dynamics of the ball movement leads to much higher performance.
Image Processing (ICIP), 2010 17th IEEE International Conference on; 10/2010
-
ABSTRACT: Cohort models are non-match models available in a biometric system. They could be other enrolled models in the gallery of the system. Cohort models have been widely used in biometric systems. A well-established scheme such as T-norm exploits cohort models to predict the statistical parameters of non-match scores for biometric authentication. They have also been used to predict failure or recognition performance of a biometric system. In this paper we show that cohort models that are sorted by their similarity to the claimed target model can produce a discriminative score pattern. We also show that polynomial regression can be used to extract discriminative parameters from these patterns. These parameters can be combined with the raw score to improve the recognition performance of an authentication system. The experimental results obtained for the face and fingerprint modalities of the Biosecure database validate this claim.
Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on; 10/2010
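The two ideas named in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, T-norm is written in its usual form (raw score standardised by the cohort score mean and standard deviation, here the population standard deviation), and the polynomial regression is reduced to a degree-1 (straight line) least-squares fit over cohort scores that are assumed to be already ordered by the cohort models' similarity to the claimed target model.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """Standardise a raw match score using cohort (non-match) score statistics."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


def cohort_line_fit(ordered_cohort_scores):
    """Least-squares line through (rank, score) pairs.

    Assumes the scores are already ordered by the similarity of each cohort
    model to the claimed target model (rank 1 = most similar). Returns
    (slope, intercept), which could be used as extra features alongside
    the raw score.
    """
    ys = list(ordered_cohort_scores)
    if len(ys) < 2:
        raise ValueError("need at least two cohort scores")
    xs = [float(r) for r in range(1, len(ys) + 1)]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept
```

A caller would compute `t_norm(raw, cohort)` and `cohort_line_fit(cohort)` for each access attempt and pass the resulting values to whatever fusion classifier is in use; that final combination step is left out here because the abstract does not specify it.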
-
ABSTRACT: The lip-region can be interpreted as either a genetic or behavioral biometric trait depending on whether static or dynamic information is used. Despite this breadth of possible application as a biometric, lip-based biometric systems are scarcely developed in scientific literature compared to other more popular traits such as face or voice. This is because of the generalized view of the research community about the lack of discriminative power in the lip region. In this paper, we propose a new method of texture representation called Local Ordinal Contrast Pattern (LOCP) for use in the representation of both appearance and dynamics features observed within a given lip-region during speech production. The use of this new feature representation, in conjunction with some standard speaker verification engines based on Linear Discriminant Analysis and Histogram-distance based methods, is shown to drastically improve the performance of the lip-biometric trait compared to the existing state-of-the-art methods. The best reported state-of-the-art performance was an HTER of 13.35% for the XM2VTS database. We obtained an HTER of less than 1%. The improvement obtained is remarkable and suggests that there is enough discriminative information in the mouth-region to enable its use as a primary biometric modality as opposed to a “soft” biometric trait as has been done in previous research.
Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on; 10/2010
-
ABSTRACT: The application of biometric technology has so far been top-down, driven by governments and law enforcement agencies. The low demand for this technology from the public, despite its many advantages compared to the traditional means of authentication, is probably due to the lack of human factor considerations in the design process. In this work, we propose a guideline to design an interactive quality-driven feedback mechanism. The mechanism aims to improve the quality of biometric samples during the acquisition process by putting in place objective assessment of the quality and feeding this information back to the user instantaneously, thus eliminating subjective quality judgement by the user. We illustrate the feasibility of the design methodology using face recognition as a case study. Preliminary results show that the methodology can potentially increase efficiency, effectiveness and accessibility of a biometric system.
Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on; 10/2010
-
ABSTRACT: This paper presents a graphical model for deformable face matching and landmark localization under an unknown non-rigid warp. The proposed model learns and combines statistics of both appearance and shape variations of facial images (learnt purely from a set of frontal training images) in a complex objective function in an unsupervised manner. Local and global shape variations are included in the objective function as binary and higher order clique potentials. The proposed approach exploits the sparseness of facial features to reduce the complexity of inference over the probabilistic model. Besides presenting a method for face feature localization, the paper proposes a framework for incorporation of statistical shape priors as higher order cliques into MRFs. The objective function is optimized using the dual decomposition approach, in which the higher order subproblems based on point distribution models are formulated as instances of convex quadratic programs. The approach is evaluated for feature localization on both the frontal and rotated images of the XM2VTS dataset as well as images collected from Google. The method shows high robustness to partial occlusion, pose changes etc. The method is then applied as an initialization step for a more costly matching method and is shown to be instrumental in improving performance and reducing runtime.
Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on; 10/2010
-
ABSTRACT: We review a multiple kernel learning (MKL) technique called ℓp regularised multiple kernel Fisher discriminant analysis (MK-FDA), and investigate the effect of feature space denoising on MKL. Experiments show that with either the original kernels or denoised kernels, ℓp MK-FDA outperforms its fixed-norm counterparts. Experiments also show that feature space denoising boosts the performance of both single kernel FDA and ℓp MK-FDA, and that there is a positive correlation between the learnt kernel weights and the amount of variance kept by feature space denoising. Based on these observations, we argue that in the case where the base feature spaces are noisy, linear combination of kernels cannot be optimal. An MKL objective function which can take care of feature space denoising automatically, and which can learn a truly optimal (non-linear) combination of the base kernels, is yet to be found.
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on; 08/2010
-
ABSTRACT: In this paper, we generalise multiple kernel Fisher discriminant analysis (MK-FDA) such that the kernel weights can be regularised with an ℓp norm for any p ≥ 1, in contrast to existing MK-FDA that uses either ℓ1 or ℓ2 norm. We present formulations for both binary and multiclass cases and solve the associated optimisation problems efficiently with semi-infinite programming. We show on three object and image categorisation benchmarks that by learning the intrinsic sparsity of a given set of base kernels using a validation set, the proposed ℓp MK-FDA outperforms its fixed-norm counterparts, and is capable of producing state-of-the-art performance. Moreover, we show that our ℓp MK-FDA outperforms the ℓp multiple kernel support vector machine (ℓp MK-SVM) which has been recently proposed. Based on this observation and our experience with single kernel FDA and SVM, we argue that the almost century-old FDA is still a strong competitor of the popular SVM.
Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on; 07/2010
-
ABSTRACT: The problem of biometric menagerie, first pointed out by Doddington et al. (1998), is one that plagues all biometric systems. They observe that only a handful of clients (enrolled users in the gallery) actually contribute disproportionately to recognition errors. While prior literature attempting to reduce this effect focuses on either client-specific score normalization or client-specific decision strategies, in this study, we explore a novel category of approaches: group-specific score normalization. While client-specific score normalization can be negatively impacted by the paucity of genuine score samples, group-specific score normalization is less affected since the matching score samples of different clients belonging to the same group are aggregated. Experimental evidence based on face, fingerprint and iris modalities shows that our proposal generally outperforms client-specific score normalization as well as the baseline systems (without any normalization) across all possible operating points (obtained by changing the decision threshold).
Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on; 07/2010
-
ABSTRACT: As mobile devices are becoming more ubiquitous, it is now possible to enhance the security of the phone, as well as remote services requiring identity verification, by means of biometric traits such as fingerprint and speech. We refer to this as mobile biometry. The objective of this study is to increase the usability of mobile biometry for visually impaired users, using the face as the biometric trait. We illustrate a scenario of a person capturing his/her own face images which are as frontal as possible. This is a challenging task for the following reasons. First, a greater variation in head pose and degradation in image quality (e.g., blur, de-focus) is expected due to the motion introduced by the hand manipulation and unsteadiness. Second, for visually impaired users, there currently exists no mechanism to provide feedback on whether a frontal face image is detected. In this paper, an audio feedback mechanism is proposed to assist the visually impaired to acquire face images of better quality. A preliminary user study suggests that the proposed audio feedback can potentially (a) shorten the acquisition time and (b) improve the success rate of face detection, especially for the non-sighted users.
Human System Interactions (HSI), 2010 3rd Conference on; 06/2010
-
ABSTRACT: As biometric technology is rolled out on a larger scale, it will be a common scenario (known as cross-device matching) to have a template acquired by one biometric device used by another during testing. This requires a biometric system to work with different acquisition devices, an issue known as device interoperability. We further distinguish two subproblems, depending on whether the device identity is known or unknown. In the latter case, we show that the device information can be probabilistically inferred given quality measures (e.g., image resolution) derived from the raw biometric data. By keeping the template unchanged, cross-device matching can result in significant degradation in performance. We propose to minimize this degradation by using device-specific quality-dependent score normalization. In the context of fusion, after having normalized each device output independently, these outputs can be combined using the naive Bayes principle. We have compared and categorized several state-of-the-art quality-based score normalization procedures, depending on how the relationship between quality measures and score is modeled, as follows: 1) direct modeling; 2) modeling via the cluster index of quality measures; and 3) extending 2) to further include the device information (device-specific cluster index). Experimental results carried out on the Biosecure DS2 data set show that the last approach can reduce both false acceptance and false rejection rates simultaneously. Furthermore, the compounded effect of normalizing each system individually in multimodal fusion is a significant improvement in performance over the baseline fusion (without using any quality information) when the device information is given.
IEEE Transactions on Systems Man and Cybernetics - Part A Systems and Humans 06/2010; · 2.12 Impact Factor
-
ABSTRACT: Visual category recognition (VCR) is one of the most important tasks in image and video indexing. Spectral methods have recently emerged as a powerful tool for dimensionality reduction and manifold learning. Recently, Spectral Regression combined with Kernel Discriminant Analysis (SR-KDA) has been successful in many classification problems. In this paper, we adopt this solution for VCR and demonstrate its advantages over existing methods both in terms of speed and accuracy. The distinctiveness of this method is assessed experimentally using an image and a video benchmark: the PASCAL VOC Challenge 08 and the Mediamill Challenge. The experimental results indicate that SR-KDA consistently yields significant performance gains when compared with the state-of-the-art methods. The other strong point of using SR-KDA is that the time complexity scales linearly with respect to the number of concepts and the main computational complexity is independent of the number of categories.
Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on; 11/2009
-
ABSTRACT: In video based face recognition, faces typically experience challenging illumination conditions, blur, or localisation errors in several frames. To alleviate these challenges, quality measures can be used to remove the most severely degraded frames. Still, when the videos are taken in real life settings, degradations are likely to be present even in the highest quality frames, and robust recognition techniques are required. In this paper, a novel discriminative face representation derived by the Linear Discriminant Analysis of (Multiscale) Local Phase Quantisation (LPQ) histogram is proposed. First, a (multiscale) LPQ operator is applied to the face image. Next, histograms are extracted from local regions of resultant images, and projected into an LDA space to form a discriminative regional face descriptor. These methods are implemented and tested on the problem of video based face identification using the BANCA video database. Additionally, to verify their performance, experiments on standard still image FERET and BANCA face databases showing very promising results are reported.
Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on; 11/2009
-
ABSTRACT: This paper describes an example-based Bayesian method for 3D-assisted pose-independent facial texture super-resolution. The method utilizes a 3D morphable model to map facial texture from a 2D face image to a pose- and shape-normalized texture map and vice versa. The centerpiece of this method is a generative model to describe the process of forming an image from a pose- and shape-normalized texture map. The goal is to reconstruct a high-resolution texture map given a low-resolution face image. The prior knowledge about the sought high-resolution texture is incorporated into the Bayesian framework by using a recognition-based prior that encourages the gradient values of the texture map to be close to some predicted values. We develop the generative model and formulate the problem as MAP estimation. The results show that this framework is capable of performing pose-independent face recognition even when the sample set only contains exemplar face images with frontal pose. We present results in frontal and non-frontal poses. We also demonstrate that the technique can be utilized to improve face recognition results when the probe images have a lower resolution compared to the gallery images.
Biometrics: Theory, Applications, and Systems, 2009. BTAS '09. IEEE 3rd International Conference on; 10/2009
-
ABSTRACT: Cohort-based score normalization as exemplified by the T-norm (for test normalization) has been the state-of-the-art approach to account for the variability of signal quality in testing. On the other hand, user-specific score normalization such as the Z-norm and the F-norm, designed to handle variability in performance across different reference models, has also been shown to be very effective. Exploiting the strength of both approaches, this paper proposes a novel score normalization called adaptive F-norm, which is client-impostor centric, i.e., utilizing both the genuine and impostor score information, as well as adaptive, i.e., adaptive to the test condition thanks to the use of a pool of cohort models. Experiments based on the Biosecure DS2 database, which contains 6 fingers of 415 subjects, each acquired using a thermal and an optical device, show that the proposed adaptive F-norm is better than or at least as good as the other alternatives, including those recently proposed in the literature.
Biometrics: Theory, Applications, and Systems, 2009. BTAS '09. IEEE 3rd International Conference on; 10/2009