Ran He

Chinese Academy of Sciences, Beijing, China

Publications (51) · 31.98 Total Impact

  •
    ABSTRACT: Several methods have been proposed to describe face images in order to recognize them automatically. Local methods based on spatial histograms of local patterns (or operators) are among the best-performing ones. In this paper, a new method is proposed that obtains more robust histograms of local patterns by using a more discriminative spatial division strategy. Spatial histograms are obtained from regions clustered according to semantic pixel relations, making better use of the spatial information. Here, a simple rule is used in which pixels in an image patch are clustered by sorting their intensity values. The number of sets on each patch is learned by exploring the information entropy of the patch. In addition, Principal Component Analysis with a whitening process is applied for the final feature-vector dimension reduction, making the representation more compact and discriminative. The proposed division strategy is invariant to monotonic grayscale changes and proves particularly useful when there are large expression variations on the faces. The method is evaluated on three widely used face recognition databases: AR, FERET and LFW, with the very popular LBP operator and some of its extensions. Experimental results show that the proposal not only outperforms those methods that use the same local patterns with the traditional division, but also some of the best-performing state-of-the-art methods.
    EURASIP Journal on Image and Video Processing 05/2014; 2014(26). DOI:10.1186/1687-5281-2014-26 · 0.66 Impact Factor
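The division strategy described above (cluster a patch's pixels by sorted intensity, then histogram the local-pattern codes per cluster) can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's implementation; all function and variable names are mine:

```python
import numpy as np

def intensity_cluster_histograms(patch, codes, n_sets=2, n_bins=256):
    """Split a patch's pixels into `n_sets` groups by sorted intensity,
    then build one local-pattern-code histogram per group."""
    flat_int = patch.ravel()
    flat_code = codes.ravel()
    order = np.argsort(flat_int, kind="stable")  # ordering is invariant to monotonic grayscale changes
    groups = np.array_split(order, n_sets)       # equal-size intensity sets
    hists = [np.bincount(flat_code[g], minlength=n_bins) for g in groups]
    return np.concatenate(hists)

# toy usage: a 4x4 patch with LBP-like codes in [0, 255]
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, (4, 4))
codes = rng.integers(0, 256, (4, 4))
h = intensity_cluster_histograms(patch, codes)
```

Because only the intensity ordering matters, any strictly increasing grayscale transform of the patch yields the same division and therefore the same histograms.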
  •
    ABSTRACT: Robust sparse representation has shown significant potential in solving challenging problems in computer vision such as biometrics and visual surveillance. Although several robust sparse models have been proposed and promising results have been obtained, they are either for error correction or for error detection, and learning a general framework that systematically unifies these two aspects and explores their relation is still an open problem. In this paper, we develop a half-quadratic (HQ) framework to solve the robust sparse representation problem. By defining different kinds of half-quadratic functions, the proposed HQ framework is applicable to performing both error correction and error detection. More specifically, by using the additive form of HQ, we propose an $\ell_1$-regularized error correction method that iteratively recovers corrupted data from errors incurred by noise and outliers; by using the multiplicative form of HQ, we propose an $\ell_1$-regularized error detection method that learns from uncorrupted data iteratively. We also show that the $\ell_1$-regularization solved by the soft-thresholding function has a dual relationship to the Huber M-estimator, which theoretically guarantees the performance of robust sparse representation in terms of M-estimation. Experiments on robust face recognition under severe occlusion and corruption validate our framework and findings.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 02/2014; 36(2):261-275. DOI:10.1109/TPAMI.2013.102
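The error-correction idea described above, and the soft-thresholding operator behind the Huber duality, can be illustrated with a minimal numpy sketch. This is a toy alternating scheme under my own naming, not the paper's exact algorithm:

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||.||_1 (the soft-thresholding function)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def l1_error_correction(D, y, lam=0.5, n_iter=50):
    """Alternate between a least-squares fit of the code on the
    error-corrected observation and a soft-thresholding update of the
    sparse error term (illustrative sketch of l1-regularized correction)."""
    e = np.zeros_like(y)
    for _ in range(n_iter):
        x, *_ = np.linalg.lstsq(D, y - e, rcond=None)  # fit corrected data
        e = soft_threshold(y - D @ x, lam)             # isolate sparse errors
    return x, e
```

The alternation is exact block coordinate descent on 0.5·||y − Dx − e||² + λ||e||₁, which is why entries of e beyond the threshold behave like Huber-bounded residuals.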
  •
    ABSTRACT: Subspace clustering has important and wide applications in computer vision and pattern recognition. It is a challenging task to learn low-dimensional subspace structures due to the possible errors (e.g., noise and corruptions) existing in high-dimensional data. Recent subspace clustering methods usually assume a sparse representation of corrupted errors and correct the errors iteratively. However, large corruptions in real-world applications cannot be well addressed by these methods. A novel optimization model for robust subspace clustering is proposed in this paper. The objective function of our model mainly includes two parts. The first part aims to achieve a sparse representation of each high-dimensional data point with other data points. The second part aims to maximize the correntropy between a given data point and its low-dimensional representation with other points. Correntropy is a robust measure, so the influence of large corruptions on subspace clustering can be greatly suppressed. An extension of our method with explicit introduction of representation error terms into the model is also proposed. Half-quadratic minimization is provided as an efficient solution to the proposed robust subspace clustering formulations. Experimental results on the Hopkins 155 dataset and the Extended Yale Database B demonstrate that our method outperforms state-of-the-art subspace clustering methods.
    Proceedings of the 2013 IEEE International Conference on Computer Vision; 12/2013
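The correntropy measure central to the abstract above, and the multiplicative half-quadratic weights used to optimize it, are both one-liners in numpy. A minimal sketch (names are mine):

```python
import numpy as np

def correntropy(a, b, sigma=1.0):
    """Correntropy between two vectors: the mean Gaussian kernel of the
    errors. The kernel is bounded, so large errors contribute little."""
    e = a - b
    return np.mean(np.exp(-e**2 / (2 * sigma**2)))

def hq_weights(residual, sigma=1.0):
    """Multiplicative half-quadratic auxiliary weights for a correntropy
    objective: entries with large residuals get weights near 0 and are
    effectively down-weighted in the next least-squares step."""
    return np.exp(-residual**2 / (2 * sigma**2))
```

This bounded-influence property is exactly why large corruptions are suppressed: an outlier's weight decays as a Gaussian in its residual rather than growing quadratically.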
  •
    ABSTRACT: Cross-modal matching has recently drawn much attention due to the widespread existence of multimodal data. It aims to match data from different modalities, and generally involves two basic problems: the measure of relevance and coupled feature selection. Most previous works mainly focus on solving the first problem. In this paper, we propose a novel coupled linear regression framework to deal with both problems. Our method learns two projection matrices to map multimodal data into a common feature space, in which cross-modal data matching can be performed. In the learning procedure, $\ell_{21}$-norm penalties are imposed on the two projection matrices separately, which leads to the selection of relevant and discriminative features from the coupled feature spaces simultaneously. A trace norm is further imposed on the projected data as a low-rank constraint, which enhances the relevance of different modal data with connections. We also present an iterative algorithm based on half-quadratic minimization to solve the proposed regularized linear regression problem. The experimental results on two challenging cross-modal datasets demonstrate that the proposed method outperforms the state-of-the-art approaches.
    Proceedings of the 2013 IEEE International Conference on Computer Vision; 12/2013
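The $\ell_{21}$-norm penalty mentioned above, and the half-quadratic reweighting diagonal typically used in iterative solvers for it, can be sketched as follows (a generic sketch of the standard technique, not the paper's solver; the small eps guards against division by zero for all-zero rows):

```python
import numpy as np

def l21_norm(W):
    """Sum of row-wise l2 norms: drives whole rows (features) to zero,
    which is what makes the penalty perform feature selection."""
    return np.sum(np.linalg.norm(W, axis=1))

def l21_hq_diag(W, eps=1e-8):
    """Half-quadratic reweighting diagonal for the l2,1 penalty:
    D_ii = 1 / (2 * ||w_i||_2), recomputed at each iteration so that the
    penalized problem reduces to a weighted least-squares step."""
    return 1.0 / (2.0 * (np.linalg.norm(W, axis=1) + eps))
```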
  • Source
    ABSTRACT: Great progress has been achieved in face recognition in the last three decades. However, it is still challenging to characterize the identity related features in face images. This paper proposes a novel facial feature extraction method named Gabor Ordinal Measures (GOM), which integrates the distinctiveness of Gabor features and the robustness of ordinal measures as a promising solution to jointly handle inter-person similarity and intra-person variations in face images. In the proposal, different kinds of ordinal measures are derived from magnitude, phase, real and imaginary components of Gabor images, respectively, and then are jointly encoded as visual primitives in local regions. The statistical distributions of these visual primitives in face image blocks are concatenated into a feature vector and linear discriminant analysis is further used to obtain a compact and discriminative feature representation. Finally, a two-stage cascade learning method and a greedy block selection method are used to train a strong classifier for face recognition. Extensive experiments on publicly available face image databases such as FERET, AR and large scale FRGC v2.0 demonstrate state-of-the-art face recognition performance of GOM.
    IEEE Transactions on Information Forensics and Security 11/2013; PP(99). DOI:10.1109/TIFS.2013.2290064 · 2.07 Impact Factor
  •
    ABSTRACT: This paper investigates the problem of cross-modal retrieval, where users can search results across various modalities by submitting any modality of query. Since the query and its retrieved results can be of different modalities, how to measure the content similarity between different modalities of data remains a challenge. To address this problem, we propose a joint graph regularized multi-modal subspace learning (JGRMSL) algorithm, which integrates inter-modality similarities and intra-modality similarities into a joint graph regularization to better explore the cross-modal correlation and the local manifold structure in each modality of data. To obtain good class separation, the idea of Linear Discriminant Analysis (LDA) is incorporated into the proposed method by maximizing the between-class covariance of all projected data and minimizing the within-class covariance of all projected data. Experimental results on two public cross-modal datasets demonstrate the effectiveness of our algorithm.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  •
    ABSTRACT: Subspace clustering via Low-Rank Representation (LRR) has shown its effectiveness in clustering the data points sampled from a union of multiple subspaces. In the original LRR, the noise in data is assumed to be Gaussian or sparse, which may be inappropriate in real-world scenarios, especially when the data is densely corrupted. In this paper, we aim to improve the robustness of LRR in the presence of large corruptions and outliers. First, we propose a robust LRR method by introducing the correntropy loss function. Second, a column-wise correntropy loss function is proposed to handle the sample-specific errors in data. Furthermore, an iterative algorithm based on half-quadratic optimization is developed to solve the proposed methods. Experimental results on the Hopkins 155 dataset and the Extended Yale Database B show that our methods can further improve the robustness of LRR and outperform other subspace clustering methods.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  •
    ABSTRACT: Both image alignment and image clustering have been widely researched, with numerous applications, in recent years. These two problems are traditionally studied separately. However, in many real-world applications, both alignment and clustering results are needed. Recent studies have shown that alignment and clustering are two highly coupled problems, so we try to solve them in a unified framework. In this paper, we propose a novel joint alignment and clustering algorithm by integrating spatial transformation parameters and clustering parameters into a unified objective function. The proposed function seeks the lowest-rank representation among all the candidates that can represent misaligned images; it is in fact a transformed Low-Rank Representation. To the best of our knowledge, this is the first work to cluster misaligned images using a transformed Low-Rank Representation. We solve the proposed function by linearizing the objective and then iteratively solving a sequence of linear problems via the Augmented Lagrange Multipliers method. Experimental results on various data sets validate the effectiveness of our method.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  •
    ABSTRACT: The wide deployments of iris recognition systems promote the emergence of different types of iris sensors. Large differences such as illumination wavelength and resolution result in cross-sensor variations of iris texture patterns. These variations decrease the accuracy of iris recognition. To address this issue, a feasible solution is to select an optimal effective feature set for all types of iris sensors. In this paper, we propose a margin based feature selection method for cross-sensor iris recognition. This method learns coupled feature weighting factors by minimizing a cost function, which aims at selecting the feature set to represent the intrinsic characteristics of iris images from different sensors. Then, the optimization problem can be formulated and solved using linear programming. Extensive experiments on the Notre Dame Cross Sensor Iris Database and CASIA cross sensor iris database show that the proposed method outperforms conventional feature selection methods in cross-sensor iris recognition.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  •
    ABSTRACT: Spectral regression has been an efficient and powerful tool for face recognition. However, spectral regression is sensitive to the errors incurred by inaccurate annotation and occlusion. This paper studies robust spectral-regression-based discriminant subspace learning from correntropy and the spatially smooth structure of the facial subspace. First, we formulate the robust discriminant subspace learning problem as a maximum correntropy problem, which finds the solution with maximum correlation between spectral targets and predictions. Second, total variation (TV) regularization is imposed on the correntropy objective to learn a spatially smooth face structure. Lastly, based on the additive form of half-quadratic optimization, we cast the maximum correntropy problem into a compound regularization model, which can be efficiently optimized via an accelerated proximal gradient algorithm. Compared with iteratively reweighted least squares based methods, the proposed method can not only improve recognition rates but also reduce computational cost. Experimental results on a couple of face recognition datasets demonstrate the robustness and effectiveness of our method against inaccurate annotation and occlusion.
    Neurocomputing 10/2013; 118:33-40. DOI:10.1016/j.neucom.2013.02.011 · 2.01 Impact Factor
  •
    ABSTRACT: Low-rank matrix recovery algorithms aim to recover a corrupted low-rank matrix with sparse errors. However, corrupted errors may not be sparse in real-world problems, and the relationship between the L1 regularizer on noise and robust M-estimators is still unknown. This paper proposes a general robust framework for low-rank matrix recovery via implicit regularizers of robust M-estimators, which are derived from convex conjugacy and can be used to model arbitrarily corrupted errors. Based on the additive form of half-quadratic optimization, proximity operators of implicit regularizers are developed such that both low-rank structure and corrupted errors can be alternately recovered. In particular, the dual relationship between the absolute function in the L1 regularizer and the Huber M-estimator is studied, which establishes a relationship between robust low-rank matrix recovery methods and M-estimator-based robust principal component analysis methods. Extensive experiments on synthetic and real-world datasets corroborate our claims and verify the robustness of the proposed framework.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 09/2013; 36(4). DOI:10.1109/TPAMI.2013.188
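One concrete proximity operator used throughout this line of work is singular value thresholding, the prox of the nuclear norm, which supplies the low-rank update in alternating recovery schemes (the paper's implicit regularizers would supply the companion error-term prox). A minimal sketch:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximity operator of tau * ||.||_*.
    Soft-threshold the singular values, leaving a (usually) lower-rank matrix."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```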
  •
    ABSTRACT: Sparse recovery aims to find the sparsest solution of an underdetermined system Xβ = y. This paper studies simple yet efficient sparse recovery algorithms from a novel viewpoint of convex conjugacy. To this end, we induce a family of convex conjugated loss functions as a smooth approximation of the l0-norm. Then we apply the additive form of half-quadratic (HQ) optimization to solve these loss functions and to reformulate the sparse recovery problem as an augmented quadratic constraint problem that can be efficiently computed by alternate minimization. At each iteration, we compute the auxiliary vector of HQ via the minimizer function and then project this vector into the nullspace of the homogeneous linear system Xβ = 0 such that a feasible and sparser solution is obtained. Extensive experiments on random sparse signals and robust face recognition corroborate our claims and validate that our method outperforms state-of-the-art l1 minimization algorithms in terms of computational cost and estimation error.
    Neurocomputing 09/2013; 115:178–185. DOI:10.1016/j.neucom.2012.12.034 · 2.01 Impact Factor
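The feasibility-projection step described above (mapping a candidate back onto the solution set of Xβ = y by a nullspace-aligned correction) has a compact least-squares form. A generic sketch of that step only, with my own naming:

```python
import numpy as np

def project_feasible(X, y, beta):
    """Project beta onto the affine solution set {b : X b = y} by subtracting
    the minimum-norm correction (which lies in the row space of X). A point
    that already satisfies the system is left unchanged."""
    correction, *_ = np.linalg.lstsq(X, X @ beta - y, rcond=None)
    return beta - correction
```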
  •
    ABSTRACT: It is necessary to match heterogeneous iris images captured by different types of iris sensors, given the increasing demand for interoperable identity management systems. The significant differences among multiple types of iris sensors, such as optical lens and illumination wavelength, determine the cross-sensor variations of iris texture patterns. Therefore, it is a challenging problem to select a common feature set that is effective for all types of iris sensors. This paper proposes a novel optimization model of coupled feature selection for cross-sensor iris recognition. The objective function of our model includes two parts: the first part aims to minimize the misclassification errors; the second part is designed to achieve sparsity in coupled feature spaces based on l2,1-norm regularization. In the training stage, the proposed feature selection model can be formulated as a half-quadratic optimization problem, and an iterative algorithm is developed to obtain the solution. Experimental results on the Notre Dame Cross Sensor Iris Database and CASIA cross sensor iris database show that features selected by the proposed method perform better than those selected by conventional single-space feature selection methods such as Boosting and l1 regularization methods.
    2013 IEEE 6th International Conference on Biometrics: Theory, Applications and Systems (BTAS); 09/2013
  •
    ABSTRACT: This paper proposes a novel nonnegative sparse representation approach, called two-stage sparse representation (TSR), for robust face recognition on a large-scale database. Based on the divide and conquer strategy, TSR decompos 1f0 es the procedure of robust face recognition into outlier detection stage and recognition stage. In the first stage, we propose a general multisubspace framework to learn a robust metric in which noise and outliers in image pixels are detected. Potential loss functions, including L1 , L2,1, and correntropy are studied. In the second stage, based on the learned metric and collaborative representation, we propose an efficient nonnegative sparse representation 64c algorithm to find an approximation solution of sparse representation. According to the L1 ball theory in sparse representation, the approximated solution is unique and can be optimized efficiently. Then a filtering strategy is developed to avoid the computation of the sparse representation on the whole large-scale dataset. Moreover, theoretical analysis also gives the necessary condition for nonnegative least squares technique to find a sparse solution. Extensive experiments on several public databases have demonstrated that the proposed TSR approach, in general, achieves better classification accuracy than the state-of-the-art sparse representation methods. More importantly, a significant reduction of computational costs is reached in comparison with sparse representation classifier; this enables the TSR to be more suitable for robust face recognition on a large-scale dataset.
    IEEE transactions on neural networks and learning systems 01/2013; 24(1):35-46. DOI:10.1109/TNNLS.2012.2226471 · 4.37 Impact Factor
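The nonnegative least squares building block mentioned above tends to produce sparse codes on its own, without an explicit l1 penalty. A generic projected-gradient NNLS sketch (not the paper's filtering algorithm; names are mine):

```python
import numpy as np

def nn_sparse_code(D, y, n_iter=500):
    """Projected-gradient nonnegative least squares: min ||D x - y||^2, x >= 0.
    The nonnegativity constraint alone often yields a sparse code, the
    property that a two-stage approach can build on."""
    lr = 1.0 / np.linalg.norm(D, 2) ** 2          # step size from spectral norm
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        x = np.maximum(x - lr * (D.T @ (D @ x - y)), 0.0)  # gradient step + projection
    return x
```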
  •
    ABSTRACT: Active shape model (ASM), as a method for extracting and representing object shapes, has received considerable attention in recent years. In ASM, a shape is represented statistically by a set of well-defined landmark points and its variations are modeled by principal component analysis (PCA). However, we find that both PCA and Procrustes analysis are sensitive to noise, and there is a linear relationship between alignment error and the magnitude of noise, which makes parameter estimation ill-posed. In this paper, we present a sparse ASM based on l1-minimization for shape alignment, which can automatically select an effective group of principal components to represent a given shape. A noise term is introduced into both the shape parameter and the pose parameters (scale, translation, and rotation), and parameter estimation is solved within the l1-minimization framework. The estimation of these two kinds of parameters is independent and robust to local noise. Experiments on a face dataset validate the robustness and effectiveness of the proposed technique.
    Proceedings of the 7th Chinese conference on Biometric Recognition; 12/2012
  •
    ABSTRACT: Feature extraction plays an important role in face recognition. Based on local binary patterns (LBP), we propose a novel face representation method which obtains histograms of semantic pixel sets based LBP (spsLBP) with robust code voting (rcv). By clustering according to the semantic pixel relations before the histogram estimation, spsLBP makes better use of the spatial information than the original LBP. In this paper, a simple rule exploits the semantic information: pixels are clustered by their intensity values, which is invariant to monotonic grayscale changes and particularly useful when there are occlusions and expression variations on face images. In addition, the proposed representation adopts a new code voting strategy for LBP histogram computation, which makes it more robust. The proposed method is evaluated on three widely used face recognition databases: AR, FERET and LFW. Experimental results show that the proposed method outperforms the original uniform LBP and its extensions.
    Proceedings of the 11th Asian conference on Computer Vision - Volume Part II; 11/2012
  •
    ABSTRACT: Non-negative matrix factorization (NMF) and its variants have been explored in the last decade and are still attractive due to their ability to extract non-negative basis images. However, most existing NMF-based methods are not ready for encoding higher-order data information. One reason is that they do not directly/explicitly model structured data information during learning, and therefore the extracted basis images may not completely describe the “parts” in an image [1] very well. In order to solve this problem, structured sparse NMF has recently been proposed to learn structured basis images. It, however, depends on special prior knowledge, i.e., one needs to exhaustively define a set of structured patterns in advance. In this paper, we wish to perform structured sparsity learning as automatically as possible. To that end, we propose a pixel dispersion penalty (PDP), which effectively describes the spatial dispersion of pixels in an image without using any manually predefined structured patterns as constraints. In PDP, we consider each part-based feature pattern of an image as a cluster of non-zero pixels; that is, the non-zero pixels of a local pattern should be spatially close to each other. Furthermore, by incorporating the proposed PDP, we develop a spatial non-negative matrix factorization (Spatial NMF) and a spatial non-negative component analysis (Spatial NCA). In Spatial NCA, the non-negativity constraint is only imposed on basis images and the constraint on coefficients is released, so both subtractive and additive combinations of non-negative basis images are allowed for reconstructing any image. Extensive experiments are conducted to validate the effectiveness of the proposed pixel dispersion penalty. We also show experimentally that Spatial NCA is more flexible for extracting non-negative basis images and obtains better and more stable performance.
    Pattern Recognition 08/2012; 45(8):2912–2926. DOI:10.1016/j.patcog.2012.01.022 · 2.58 Impact Factor
  • Source
    ABSTRACT: This work presents a systematic study of objective evaluations of abstaining classifications using information-theoretic measures (ITMs). First, we define objective measures as the ones which do not depend on any free parameter. According to this definition, technical simplicity for examining “objectivity” or “subjectivity” is directly provided for classification evaluations. Second, we propose 24 normalized ITMs for investigation, which are derived from either mutual information, divergence, or cross-entropy. Contrary to conventional performance measures that apply empirical formulas based on users' intuitions or preferences, the ITMs are theoretically more general for realizing objective evaluations of classifications. They are able to distinguish “error types” and “reject types” in binary classifications without the need to input cost-term data. Third, to better understand and select the ITMs, we suggest three desirable features for classification assessment measures, which appear more crucial and appealing from the viewpoint of classification applications. Using these features as “meta-measures”, we can reveal the advantages and limitations of ITMs from a higher level of evaluation knowledge. Numerical examples are given to demonstrate our claims and compare the differences among the proposed measures. The best measure is selected in terms of the meta-measures, and its specific properties regarding error types and reject types are analytically derived.
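One simple normalized ITM of the mutual-information family can be computed directly from a confusion matrix. The paper studies 24 variants; the particular normalization below (by the true-class entropy) is my own choice for illustration:

```python
import numpy as np

def normalized_mi(C):
    """A normalized mutual-information measure from a confusion matrix C
    (rows: true class, columns: decision, possibly including a reject
    column). Parameter-free, hence 'objective' in the paper's sense."""
    P = C / C.sum()
    px = P.sum(axis=1, keepdims=True)   # true-class marginal
    py = P.sum(axis=0, keepdims=True)   # decision marginal
    outer = px @ py
    nz = P > 0                          # skip zero cells (0 * log 0 = 0)
    mi = np.sum(P[nz] * np.log(P[nz] / outer[nz]))
    hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
    return mi / hx if hx > 0 else 0.0
```

A perfect classifier scores 1; a classifier whose decisions are independent of the true class scores 0.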
  •
    ABSTRACT: This paper proposes a new image representation method named Histograms of Gabor Ordinal Measures (HOGOM) for robust face recognition. First, a novel texture descriptor, Gabor Ordinal Measures (GOM), is developed to inherit the advantages of Gabor features and Ordinal Measures. GOM applies Gabor filters of different orientations and scales on the face image and then computes Ordinal Measures over each Gabor magnitude response. Second, in order to obtain an effective and compact representation, the binary values of each GOM, for different orientations at a given scale, are encoded into a single decimal number, and then spatial histograms of non-overlapping rectangular regions are computed. Finally, a nearest-neighbor classifier with the χ2 dissimilarity measure is used for classification. HOGOM has three principal advantages: 1) it inherits the spatial locality and orientation selectivity of Gabor features; 2) the adopted region-comparison strategy makes it more robust; 3) by applying the binary codification and computing spatial histograms, it becomes more stable and efficient. Extensive experiments on the large-scale FERET database and the AR database show the robustness of the proposed descriptor, achieving the state of the art.
    International Conference on Biometrics; 03/2012
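The χ2 dissimilarity used for matching above is a standard histogram distance; a minimal numpy sketch (the eps term, my addition, guards against empty bins):

```python
import numpy as np

def chi2_dissimilarity(h1, h2, eps=1e-10):
    """Chi-square dissimilarity between two histograms: squared differences
    normalized by bin mass, so well-populated bins dominate less."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```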
  •
    ABSTRACT: Mean-Shift (MS) is a powerful nonparametric clustering method. Although good accuracy can be achieved, its computational cost is particularly expensive even on moderate data sets. In this paper, for the purpose of algorithmic speedup, we develop an agglomerative MS clustering method along with its performance analysis. Our method, namely Agglo-MS, is built upon an iterative query set compression mechanism which is motivated by the quadratic bounding optimization nature of MS algorithm. The whole framework can be efficiently implemented in linear running time complexity. We then extend Agglo-MS into an incremental version which performs comparably to its batch counterpart. The efficiency and accuracy of Agglo-MS are demonstrated by extensive comparing experiments on synthetic and real data sets.
    IEEE Transactions on Knowledge and Data Engineering 03/2012; 24(2):209-219. DOI:10.1109/TKDE.2010.232 · 1.82 Impact Factor
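The basic mean-shift iteration that Agglo-MS accelerates moves a query point toward its local density mode via kernel-weighted averages. A minimal single-point sketch with a Gaussian kernel (names are mine):

```python
import numpy as np

def mean_shift_point(x, data, bandwidth=1.0, n_iter=100, tol=1e-6):
    """Iterate the mean-shift update: replace x with the kernel-weighted
    mean of the data until the shift is below tol (a density mode)."""
    for _ in range(n_iter):
        w = np.exp(-np.sum((data - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x_new = (w[:, None] * data).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new
```

Running this from every data point is what makes plain MS expensive, O(n) kernel sums per step per query, which is the cost the query-set compression in Agglo-MS targets.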

Publication Stats

483 Citations
31.98 Total Impact Points

Institutions

  • 2006–2012
    • Chinese Academy of Sciences
      • Institute of Automation
      • National Laboratory of Pattern Recognition
      Beijing, China
  • 2010–2011
    • Dalian University of Technology
      • School of Electronic and Information Engineering
      Dalian, Liaoning, China