Ran He

Chinese Academy of Sciences, Beijing, China

Publications (56) · 49.89 Total impact

  • ABSTRACT: Subspace clustering has important and wide applications in computer vision and pattern recognition. It is a challenging task to learn low-dimensional subspace structures due to complex noise existing in high-dimensional data. Complex noise has a more complicated statistical structure than Gaussian or Laplacian noise. Recent subspace clustering methods usually assume a sparse representation of the errors incurred by noise and correct these errors iteratively. However, large corruptions incurred by complex noise cannot be well addressed by these methods. A novel optimization model for robust subspace clustering is proposed in this paper. Its objective function mainly includes two parts. The first part aims to achieve a sparse representation of each high-dimensional data point with other data points. The second part aims to maximize the correntropy between a given data point and its low-dimensional representation with other points. Correntropy is a robust measure, so the influence of large corruptions on subspace clustering can be greatly suppressed. An extension of pairwise link constraints is also proposed as prior information to deal with complex noise. Half-quadratic minimization is provided as an efficient solution to the proposed robust subspace clustering formulations. Experimental results on three commonly used datasets show that our method outperforms state-of-the-art subspace clustering methods.
    IEEE Transactions on Image Processing 07/2015; DOI:10.1109/TIP.2015.2456504 · 3.11 Impact Factor
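    A minimal sketch of the half-quadratic reweighting idea this entry builds on: each point is coded over the remaining points under an l1 penalty while entries with large residuals receive exponentially small correntropy weights. The simple ISTA-style solver, the parameters, and the names below are illustrative assumptions, not the paper's implementation.
    ```python
    import numpy as np

    def correntropy_weights(residual, sigma):
        # Half-quadratic auxiliary weights for the Welsch/correntropy loss:
        # large residuals receive exponentially small weight.
        return np.exp(-residual**2 / (2.0 * sigma**2))

    def robust_sparse_code(X, i, lam=0.1, sigma=1.0, n_iter=50):
        # Represent column X[:, i] with the other columns, downweighting
        # corrupted entries via correntropy (a simplified ISTA + HQ sketch).
        D = np.delete(X, i, axis=1)            # dictionary: all other points
        y = X[:, i]
        c = np.zeros(D.shape[1])
        step = 1.0 / np.linalg.norm(D, 2) ** 2  # step size from the spectral norm
        for _ in range(n_iter):
            r = y - D @ c                       # per-entry residual
            w = correntropy_weights(r, sigma)   # HQ step: fix the weights
            grad = -D.T @ (w * r)               # gradient of the weighted LS term
            c = c - step * grad
            c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft threshold
        return c
    ```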
  • ABSTRACT: This paper presents a structured ordinal measure method for video-based face recognition that simultaneously learns ordinal filters and structured ordinal features. The problem is posed as a non-convex integer programming problem with two parts. The first part learns stable ordinal filters to project video data into a large-margin ordinal space. The second part seeks self-correcting and discrete codes by balancing the projected data and a rank-one ordinal matrix in a structured low-rank way. Unsupervised and supervised structures are considered for the ordinal matrix. In addition, as a complement to hierarchical structures, deep feature representations are integrated into our method to enhance coding stability. An alternating minimization method is employed to handle the discrete and low-rank constraints, yielding high-quality codes that capture prior structures well. Experimental results on three commonly used face video databases show that our method with a simple voting classifier can achieve state-of-the-art recognition rates using fewer features and samples.
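    A toy illustration of the ordinal coding used here: frames are projected with a set of filters and each bit records which of two filter responses is larger. The random filters and comparison pairs stand in for the learned, structured ones described above.
    ```python
    import numpy as np

    def ordinal_codes(features, filters, pairs):
        # Project frames with (assumed pre-learned) ordinal filters and binarize
        # by pairwise comparison of filter responses.
        responses = features @ filters              # (n_frames, n_filters)
        a, b = pairs[:, 0], pairs[:, 1]
        return (responses[:, a] > responses[:, b]).astype(np.uint8)

    # toy usage: 10 frames of 128-D features, 16 random filters, 32 comparisons
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(10, 128))
    filters = rng.normal(size=(128, 16))
    pairs = rng.integers(0, 16, size=(32, 2))
    codes = ordinal_codes(frames, filters, pairs)   # (10, 32) binary matrix
    ```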
  • ABSTRACT: Multi-view clustering, which aims to cluster datasets with multiple sources of information, has a wide range of applications in the communities of data mining and pattern recognition. Generally, it makes use of the complementary information embedded in multiple views to improve clustering performance. Recent methods usually find a low-dimensional embedding of multi-view data, but often ignore useful prior information that could be utilized to better discover the latent group structure of multi-view data. To alleviate this problem, a novel pairwise sparse subspace representation model for multi-view clustering is proposed in this paper. The objective function of our model mainly includes two parts. The first part aims to harness prior information to achieve a sparse representation of each high-dimensional data point with respect to other data points in the same view. The second part aims to maximize the correlation between the representations of different views. An alternating minimization method is provided as an efficient solution for the proposed multi-view clustering algorithm. A detailed theoretical analysis is also conducted to guarantee the convergence of the proposed method. Moreover, we show that must-link and cannot-link constraints can be naturally integrated into the proposed model to obtain a link-constrained multi-view clustering model. Extensive experiments on five real-world datasets demonstrate that the proposed model performs better than several state-of-the-art multi-view clustering methods.
    Neurocomputing 05/2015; 156. DOI:10.1016/j.neucom.2015.01.017 · 2.01 Impact Factor
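    A rough sketch of the alternating-minimization flavour described above, for two views: each view's sparse code is updated by a proximal gradient step, and a coupling term (my simplification of the cross-view correlation objective) pulls the two codes toward each other. Both dictionaries are assumed to index the same samples, so the codes have equal length.
    ```python
    import numpy as np

    def soft(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def two_view_codes(D1, y1, D2, y2, lam=0.1, gamma=0.05, n_iter=100):
        # Alternately update the sparse codes of two views; the gamma term
        # encourages the two codes to agree (a stand-in for correlation maximization).
        c1 = np.zeros(D1.shape[1])
        c2 = np.zeros(D2.shape[1])
        s1 = 1.0 / np.linalg.norm(D1, 2) ** 2
        s2 = 1.0 / np.linalg.norm(D2, 2) ** 2
        for _ in range(n_iter):
            g1 = D1.T @ (D1 @ c1 - y1) - gamma * c2   # coupled gradient, view 1
            c1 = soft(c1 - s1 * g1, s1 * lam)
            g2 = D2.T @ (D2 @ c2 - y2) - gamma * c1   # coupled gradient, view 2
            c2 = soft(c2 - s2 * g2, s2 * lam)
        return c1, c2
    ```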
  • ABSTRACT: High-dimensional dense features have been shown to be useful for face recognition, but they result in high query times when searching a large-scale face database. Hence binary codes are often used to obtain fast query speeds and reduce storage requirements. However, binary codes for face features can become unstable and unpredictable due to face variations induced by pose, expression and illumination. This paper proposes a predictable hash code algorithm to map face samples from the original feature space to Hamming space. First, we discuss the ‘predictability’ of hash codes for face indexing. Second, we formulate the predictable hash coding problem as a non-convex combinatorial optimization problem, in which the distance between codes for samples from the same class is minimized while the distance between codes for samples from different classes is maximized. An Expectation Maximization method is introduced to iteratively find a sparse and predictable linear mapping. Lastly, a deep feature representation is learned to further enhance the predictability of binary codes. Experimental results on three commonly used face databases demonstrate the superiority of our predictable hash coding algorithm on large-scale problems.
    Pattern Recognition 04/2015; 48(10). DOI:10.1016/j.patcog.2015.03.016 · 2.58 Impact Factor
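    The query-time side of such a hashing scheme is straightforward to sketch: once a projection W has been learned (random below, purely for illustration), features are binarized by sign and candidates are ranked by Hamming distance.
    ```python
    import numpy as np

    def hash_codes(features, W):
        # Binarize real-valued features by the sign of a linear projection.
        return (features @ W > 0).astype(np.uint8)

    def hamming_search(query_code, db_codes, k=5):
        # Return indices of the k database codes closest in Hamming distance.
        dists = np.count_nonzero(db_codes != query_code, axis=1)
        return np.argsort(dists)[:k]

    # toy usage with a random (placeholder) projection matrix
    rng = np.random.default_rng(0)
    W = rng.normal(size=(512, 64))
    gallery = hash_codes(rng.normal(size=(1000, 512)), W)
    probe = hash_codes(rng.normal(size=(1, 512)), W)[0]
    nearest = hamming_search(probe, gallery, k=5)
    ```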
  • ABSTRACT: In multimedia applications, the text and image components in a web document form a pairwise constraint that potentially indicates the same semantic concept. This paper studies cross-modal learning via the pairwise constraint and aims to find the common structure hidden in different modalities. We first propose a compound regularization framework to deal with the pairwise constraint, which can be used as a general platform for developing cross-modal algorithms. For unsupervised learning, we propose a cross-modal subspace clustering method to learn a common structure for different modalities. For supervised learning, to reduce the semantic gap and the outliers in pairwise constraints, we propose a cross-modal matching method based on compound l2,1 regularization along with an iteratively reweighted algorithm to find the global optimum. Extensive experiments demonstrate the benefits of joint text and image modeling with semantically induced pairwise constraints, and show that the proposed cross-modal methods can further reduce the semantic gap between different modalities and improve the clustering/retrieval accuracy.
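    The iteratively reweighted treatment of an l2,1 penalty mentioned above is standard; a generic sketch (not the paper's exact compound cross-modal model) looks like this.
    ```python
    import numpy as np

    def l21_regression(X, Y, lam=0.1, n_iter=30, eps=1e-8):
        # Solve min_W ||X W - Y||_F^2 + lam * ||W||_{2,1} by iterative reweighting:
        # each row of W is reweighted by the inverse of its current l2 norm,
        # driving whole rows (features) toward zero. X is (n, d), Y is (n, t).
        W = np.linalg.lstsq(X, Y, rcond=None)[0]     # unregularized start
        for _ in range(n_iter):
            D = np.diag(1.0 / (2.0 * np.linalg.norm(W, axis=1) + eps))
            W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        return W                                      # rows with near-zero norm are discarded features
    ```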
  • ABSTRACT: Several methods have been proposed to describe face images in order to recognize them automatically. Local methods based on spatial histograms of local patterns (or operators) are among the best-performing ones. In this paper, a new method is proposed that obtains more robust histograms of local patterns by using a more discriminative spatial division strategy. Spatial histograms are obtained from regions clustered according to semantic pixel relations, making better use of the spatial information. Here, a simple rule is used, in which pixels in an image patch are clustered by sorting their intensity values. The number of sets in each patch is learned by exploring the information entropy of the patch. In addition, Principal Component Analysis with whitening is applied to reduce the dimension of the final feature vector, making the representation more compact and discriminative. The proposed division strategy is invariant to monotonic grayscale changes and proves particularly useful when there are large expression variations in the faces. The method is evaluated on three widely used face recognition databases: AR, FERET and LFW, with the very popular LBP operator and some of its extensions. Experimental results show that the proposed method outperforms not only methods that use the same local patterns with the traditional division, but also some of the best-performing state-of-the-art methods.
    EURASIP Journal on Image and Video Processing 05/2014; 2014(26). DOI:10.1186/1687-5281-2014-26 · 0.66 Impact Factor
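    A rough sketch of the intensity-based division idea, using scikit-image's standard LBP operator; the entropy-driven choice of the number of sets and the PCA-whitening step are omitted, and the parameters are illustrative.
    ```python
    import numpy as np
    from skimage.feature import local_binary_pattern

    def patch_histograms(image, grid=4, n_sets=3, P=8, R=1):
        # Split each patch's pixels into sets by sorted intensity (instead of a
        # rigid spatial sub-grid) and build one LBP histogram per set.
        lbp = local_binary_pattern(image, P, R, method="uniform")
        n_bins = P + 2                                  # uniform LBP code count
        h, w = image.shape
        ph, pw = h // grid, w // grid
        feats = []
        for i in range(grid):
            for j in range(grid):
                patch = image[i*ph:(i+1)*ph, j*pw:(j+1)*pw].ravel()
                codes = lbp[i*ph:(i+1)*ph, j*pw:(j+1)*pw].ravel()
                order = np.argsort(patch)               # sort pixels by intensity
                for s in np.array_split(order, n_sets): # one set per intensity band
                    hist, _ = np.histogram(codes[s], bins=n_bins, range=(0, n_bins))
                    feats.append(hist / max(len(s), 1))
        return np.concatenate(feats)
    ```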
  • ABSTRACT: Robust sparse representation has shown significant potential in solving challenging problems in computer vision such as biometrics and visual surveillance. Although several robust sparse models have been proposed and promising results have been obtained, they are designed either for error correction or for error detection, and learning a general framework that systematically unifies these two aspects and explores their relation is still an open problem. In this paper, we develop a half-quadratic (HQ) framework to solve the robust sparse representation problem. By defining different kinds of half-quadratic functions, the proposed HQ framework can perform both error correction and error detection. More specifically, by using the additive form of HQ, we propose an $\ell_1$-regularized error correction method that iteratively recovers corrupted data from errors incurred by noise and outliers; by using the multiplicative form of HQ, we propose an $\ell_1$-regularized error detection method that learns from uncorrupted data iteratively. We also show that the $\ell_1$-regularization solved by the soft-thresholding function has a dual relationship to the Huber M-estimator, which theoretically guarantees the performance of robust sparse representation in terms of M-estimation. Experiments on robust face recognition under severe occlusion and corruption validate our framework and findings.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 02/2014; 36(2):261-75. DOI:10.1109/TPAMI.2013.102 · 5.69 Impact Factor
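    The duality mentioned above can be stated concretely: the additive HQ form estimates an explicit error by soft-thresholding, while the multiplicative form downweights large residuals exactly as the Huber M-estimator does. A minimal sketch of the two update rules:
    ```python
    import numpy as np

    def soft_threshold(r, delta):
        # Additive HQ step: the l1-regularized error estimate e = soft(r, delta);
        # subtracting it from the residual gives a Huber-type correction.
        return np.sign(r) * np.maximum(np.abs(r) - delta, 0.0)

    def huber_weight(r, delta):
        # Multiplicative HQ step: Huber M-estimator weights, 1 for small residuals
        # and delta/|r| for large ones, so outliers are downweighted.
        a = np.abs(r)
        return np.where(a <= delta, 1.0, delta / np.maximum(a, 1e-12))
    ```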
  • ABSTRACT: Subspace clustering has important and wide applications in computer vision and pattern recognition. It is a challenging task to learn low-dimensional subspace structures due to the possible errors (e.g., noise and corruptions) existing in high-dimensional data. Recent subspace clustering methods usually assume a sparse representation of corrupted errors and correct the errors iteratively. However, large corruptions in real-world applications cannot be well addressed by these methods. A novel optimization model for robust subspace clustering is proposed in this paper. The objective function of our model mainly includes two parts. The first part aims to achieve a sparse representation of each high-dimensional data point with other data points. The second part aims to maximize the correntropy between a given data point and its low-dimensional representation with other points. Correntropy is a robust measure, so the influence of large corruptions on subspace clustering can be greatly suppressed. An extension of our method that explicitly introduces representation error terms into the model is also proposed. Half-quadratic minimization is provided as an efficient solution to the proposed robust subspace clustering formulations. Experimental results on the Hopkins 155 dataset and the Extended Yale Database B demonstrate that our method outperforms state-of-the-art subspace clustering methods.
    Proceedings of the 2013 IEEE International Conference on Computer Vision; 12/2013
  • ABSTRACT: Cross-modal matching has recently drawn much attention due to the widespread existence of multimodal data. It aims to match data from different modalities and generally involves two basic problems: the measure of relevance and coupled feature selection. Most previous works mainly focus on solving the first problem. In this paper, we propose a novel coupled linear regression framework to deal with both problems. Our method learns two projection matrices to map multimodal data into a common feature space, in which cross-modal data matching can be performed. In the learning procedure, l2,1-norm penalties are imposed on the two projection matrices separately, which leads to selecting relevant and discriminative features from the coupled feature spaces simultaneously. A trace norm is further imposed on the projected data as a low-rank constraint, which enhances the relevance of data from different modalities that are connected. We also present an iterative algorithm based on half-quadratic minimization to solve the proposed regularized linear regression problem. The experimental results on two challenging cross-modal datasets demonstrate that the proposed method outperforms state-of-the-art approaches.
    Proceedings of the 2013 IEEE International Conference on Computer Vision; 12/2013
  • ABSTRACT: Great progress has been achieved in face recognition over the last three decades. However, it is still challenging to characterize the identity-related features in face images. This paper proposes a novel facial feature extraction method named Gabor Ordinal Measures (GOM), which integrates the distinctiveness of Gabor features and the robustness of ordinal measures as a promising solution to jointly handle inter-person similarity and intra-person variations in face images. In the proposed method, different kinds of ordinal measures are derived from the magnitude, phase, real and imaginary components of Gabor images, respectively, and are then jointly encoded as visual primitives in local regions. The statistical distributions of these visual primitives in face image blocks are concatenated into a feature vector, and linear discriminant analysis is further used to obtain a compact and discriminative feature representation. Finally, a two-stage cascade learning method and a greedy block selection method are used to train a strong classifier for face recognition. Extensive experiments on publicly available face image databases such as FERET, AR and the large-scale FRGC v2.0 demonstrate the state-of-the-art face recognition performance of GOM.
    IEEE Transactions on Information Forensics and Security 11/2013; PP(99). DOI:10.1109/TIFS.2013.2290064 · 2.07 Impact Factor
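    A toy version of the magnitude-channel encoding described above, using scikit-image's Gabor filter: each ordinal bit compares responses at neighbouring orientations. The paper derives richer comparisons from magnitude, phase, real and imaginary parts and follows with block histograms, LDA and cascade learning, none of which is reproduced here.
    ```python
    import numpy as np
    from skimage.filters import gabor

    def gabor_ordinal_map(image, frequency=0.2,
                          thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4)):
        # Compute Gabor magnitude responses at several orientations and encode
        # ordinal bits by comparing neighbouring orientations per pixel.
        mags = []
        for theta in thetas:
            real, imag = gabor(image, frequency=frequency, theta=theta)
            mags.append(np.hypot(real, imag))
        mags = np.stack(mags)                       # (n_orientations, H, W)
        bits = (mags[:-1] > mags[1:]).astype(np.uint8)
        code = np.zeros(image.shape, dtype=np.uint8)
        for k in range(bits.shape[0]):              # pack bits into a per-pixel code
            code |= bits[k] << k
        return code                                 # block histograms would follow
    ```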
  • ABSTRACT: This paper investigates the problem of cross-modal retrieval, where users can search across various modalities by submitting a query in any modality. Since the query and its retrieved results can be of different modalities, how to measure the content similarity between different modalities of data remains a challenge. To address this problem, we propose a joint graph regularized multi-modal subspace learning (JGRMSL) algorithm, which integrates inter-modality and intra-modality similarities into a joint graph regularization to better explore the cross-modal correlation and the local manifold structure in each modality of data. To obtain good class separation, the idea of Linear Discriminant Analysis (LDA) is incorporated into the proposed method by maximizing the between-class covariance and minimizing the within-class covariance of all projected data. Experimental results on two public cross-modal datasets demonstrate the effectiveness of our algorithm.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  • ABSTRACT: Subspace clustering via Low-Rank Representation (LRR) has shown its effectiveness in clustering data points sampled from a union of multiple subspaces. In the original LRR, the noise in data is assumed to be Gaussian or sparse, which may be inappropriate in real-world scenarios, especially when the data are densely corrupted. In this paper, we aim to improve the robustness of LRR in the presence of large corruptions and outliers. First, we propose a robust LRR method by introducing the correntropy loss function. Second, a column-wise correntropy loss function is proposed to handle sample-specific errors in the data. Furthermore, an iterative algorithm based on half-quadratic optimization is developed to solve the proposed methods. Experimental results on the Hopkins 155 dataset and the Extended Yale Database B show that our methods can further improve the robustness of LRR and outperform other subspace clustering methods.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  • ABSTRACT: Both image alignment and image clustering have been widely researched in recent years, with numerous applications. These two problems are traditionally studied separately, yet in many real-world applications both alignment and clustering results are needed. Recent studies have shown that alignment and clustering are highly coupled problems, so we solve them in a unified framework. In this paper, we propose a novel joint alignment and clustering algorithm that integrates spatial transformation parameters and clustering parameters into a unified objective function. The proposed objective seeks the lowest-rank representation among all candidates that can represent the misaligned images; it is in effect a transformed Low-Rank Representation. To the best of our knowledge, this is the first work to cluster misaligned images using a transformed Low-Rank Representation. We solve the proposed problem by linearizing the objective function and then iteratively solving a sequence of linear problems via the Augmented Lagrange Multiplier method. Experimental results on various datasets validate the effectiveness of our method.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
  • ABSTRACT: The wide deployment of iris recognition systems has promoted the emergence of different types of iris sensors. Large differences, such as illumination wavelength and resolution, result in cross-sensor variations of iris texture patterns. These variations decrease the accuracy of iris recognition. A feasible solution to this issue is to select an optimal, effective feature set for all types of iris sensors. In this paper, we propose a margin-based feature selection method for cross-sensor iris recognition. This method learns coupled feature weighting factors by minimizing a cost function, which aims to select the feature set that represents the intrinsic characteristics of iris images from different sensors. The optimization problem can then be formulated and solved using linear programming. Extensive experiments on the Notre Dame Cross Sensor Iris Database and the CASIA cross-sensor iris database show that the proposed method outperforms conventional feature selection methods in cross-sensor iris recognition.
    2013 2nd IAPR Asian Conference on Pattern Recognition (ACPR); 11/2013
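    For illustration only, a generic margin-based feature weighting solved as a linear program; the per-feature margins, the simplex constraint and the 1/k cap are my assumptions and do not reproduce the paper's coupled cross-sensor cost function.
    ```python
    import numpy as np
    from scipy.optimize import linprog
    from scipy.spatial.distance import cdist

    def margin_feature_weights(X, y, k=20):
        # X: (n, d) NumPy feature matrix, y: (n,) NumPy label array.
        # Per-feature hypothesis margin: distance to nearest miss minus
        # distance to nearest hit, accumulated over all samples.
        n, d = X.shape
        D = cdist(X, X)
        np.fill_diagonal(D, np.inf)
        margins = np.zeros(d)
        for i in range(n):
            same = (y == y[i])
            same[i] = False
            hit = np.argmin(np.where(same, D[i], np.inf))
            miss = np.argmin(np.where(~same, D[i], np.inf))
            margins += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        # maximize margins @ w  subject to  sum(w) = 1, 0 <= w <= 1/k
        res = linprog(c=-margins, A_eq=np.ones((1, d)), b_eq=[1.0],
                      bounds=[(0.0, 1.0 / k)] * d, method="highs")
        return res.x            # roughly the k largest-margin features get weight
    ```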
  • ABSTRACT: Spectral regression has been an efficient and powerful tool for face recognition. However, spectral regression is sensitive to the errors incurred by inaccurate annotation and occlusion. This paper studies robust spectral regression based discriminant subspace learning, drawing on correntropy and the spatially smooth structure of the facial subspace. First, we formulate the robust discriminant subspace learning problem as a maximum correntropy problem, which finds the solution that maximizes the correlation between spectral targets and predictions. Second, total variation (TV) regularization is imposed on the correntropy objective to learn a spatially smooth face structure. Lastly, based on the additive form of half-quadratic optimization, we cast the maximum correntropy problem into a compound regularization model, which can be efficiently optimized via an accelerated proximal gradient algorithm. Compared with methods based on iteratively reweighted least squares, the proposed method not only improves recognition rates but also reduces computational cost. Experimental results on a couple of face recognition datasets demonstrate the robustness and effectiveness of our method against inaccurate annotation and occlusion.
    Neurocomputing 10/2013; 118:33-40. DOI:10.1016/j.neucom.2013.02.011 · 2.01 Impact Factor
  • ABSTRACT: Low-rank matrix recovery algorithms aim to recover a corrupted low-rank matrix with sparse errors. However, errors may not be sparse in real-world problems, and the relationship between the L1 regularizer on noise and robust M-estimators is still unknown. This paper proposes a general robust framework for low-rank matrix recovery via implicit regularizers of robust M-estimators, which are derived from convex conjugacy and can be used to model arbitrarily corrupted errors. Based on the additive form of half-quadratic optimization, proximity operators of implicit regularizers are developed such that both the low-rank structure and the corrupted errors can be alternately recovered. In particular, the dual relationship between the absolute function in the L1 regularizer and the Huber M-estimator is studied, which establishes a connection between robust low-rank matrix recovery methods and M-estimator-based robust principal component analysis methods. Extensive experiments on synthetic and real-world datasets corroborate our claims and verify the robustness of the proposed framework.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 09/2013; 36(4). DOI:10.1109/TPAMI.2013.188 · 5.69 Impact Factor
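    A compact sketch of the alternation this entry describes, using singular value thresholding for the low-rank part and, as one concrete implicit regularizer, the Huber-related soft-thresholding step for the error part (a simplified, RPCA-style reading with illustrative parameters):
    ```python
    import numpy as np

    def svt(M, tau):
        # Singular value thresholding: proximal step for the nuclear (trace) norm.
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    def robust_recovery(X, lam=0.1, mu=1.0, n_iter=100):
        # Alternate a low-rank estimate L (nuclear-norm prox) with an error
        # estimate E given by the proximity operator tied to the Huber
        # M-estimator, which here reduces to soft-thresholding of the residual.
        L = np.zeros_like(X)
        for _ in range(n_iter):
            E = np.sign(X - L) * np.maximum(np.abs(X - L) - lam, 0.0)  # error prox
            L = svt(X - E, mu)                                          # low-rank prox
        return L, E
    ```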
  • ABSTRACT: Sparse recovery aims to find the sparsest solution of an underdetermined system Xβ = y. This paper studies simple yet efficient sparse recovery algorithms from a novel viewpoint of convex conjugacy. To this end, we induce a family of convex conjugated loss functions as a smooth approximation of the l0-norm. Then we apply the additive form of half-quadratic (HQ) optimization to solve these loss functions and to reformulate the sparse recovery problem as an augmented quadratic constraint problem that can be efficiently computed by alternate minimization. At each iteration, we compute the auxiliary vector of HQ via the minimizer function and then project this vector into the nullspace of the homogeneous linear system Xβ = 0 such that a feasible and sparser solution is obtained. Extensive experiments on random sparse signals and robust face recognition corroborate our claims and validate that our method outperforms state-of-the-art l1 minimization algorithms in terms of computational cost and estimation error.
    Neurocomputing 09/2013; 115:178–185. DOI:10.1016/j.neucom.2012.12.034 · 2.01 Impact Factor
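    A rough sketch of the alternation described above, under assumed Welsch-style weights as the smooth l0 surrogate: shrink the current iterate via the HQ auxiliary vector, then project back onto the feasible set so that Xβ = y always holds.
    ```python
    import numpy as np

    def hq_sparse_recovery(X, y, sigma=0.5, n_iter=50):
        # X is underdetermined (more columns than rows); start from the
        # minimum-norm feasible solution and alternate shrinkage + projection.
        pinv = np.linalg.pinv(X)
        beta = pinv @ y                                    # feasible start: X beta = y
        for _ in range(n_iter):
            w = 1.0 - np.exp(-beta**2 / (2.0 * sigma**2))  # HQ minimizer function
            v = w * beta                                   # auxiliary vector: shrink small entries
            beta = v - pinv @ (X @ v - y)                  # project back so X beta = y
        return beta
    ```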
  • ABSTRACT: With the increasing demand for interoperable identity management systems, it is necessary to match heterogeneous iris images captured by different types of iris sensors. Significant differences among iris sensors, such as optical lenses and illumination wavelengths, cause cross-sensor variations in iris texture patterns. It is therefore a challenging problem to select a common feature set that is effective for all types of iris sensors. This paper proposes a novel optimization model of coupled feature selection for cross-sensor iris recognition. The objective function of our model includes two parts: the first part aims to minimize the misclassification errors; the second part is designed to achieve sparsity in coupled feature spaces based on l2,1-norm regularization. In the training stage, the proposed feature selection model can be formulated as a half-quadratic optimization problem, and an iterative algorithm is developed to obtain the solution. Experimental results on the Notre Dame Cross Sensor Iris Database and the CASIA cross-sensor iris database show that features selected by the proposed method perform better than those selected by conventional single-space feature selection methods such as Boosting and l1 regularization.
    2013 IEEE 6th International Conference on Biometrics: Theory, Applications and Systems (BTAS); 09/2013
  • ABSTRACT: This paper proposes a novel nonnegative sparse representation approach, called two-stage sparse representation (TSR), for robust face recognition on a large-scale database. Based on the divide-and-conquer strategy, TSR decomposes the procedure of robust face recognition into an outlier detection stage and a recognition stage. In the first stage, we propose a general multi-subspace framework to learn a robust metric in which noise and outliers in image pixels are detected. Potential loss functions, including L1, L2,1, and correntropy, are studied. In the second stage, based on the learned metric and collaborative representation, we propose an efficient nonnegative sparse representation algorithm to find an approximate solution of the sparse representation. According to the L1 ball theory in sparse representation, the approximate solution is unique and can be optimized efficiently. Then a filtering strategy is developed to avoid computing the sparse representation on the whole large-scale dataset. Moreover, theoretical analysis also gives the necessary condition for the nonnegative least squares technique to find a sparse solution. Extensive experiments on several public databases demonstrate that the proposed TSR approach generally achieves better classification accuracy than state-of-the-art sparse representation methods. More importantly, a significant reduction of computational cost is achieved in comparison with the sparse representation classifier, which makes TSR more suitable for robust face recognition on a large-scale dataset.
    IEEE transactions on neural networks and learning systems 01/2013; 24(1):35-46. DOI:10.1109/TNNLS.2012.2226471 · 4.37 Impact Factor
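    The second stage can be sketched with off-the-shelf nonnegative least squares: filter the gallery down to the most correlated candidates, solve NNLS (whose solution tends to be sparse), and vote by per-class reconstruction error. The function names, the correlation-based filter and the candidate count are illustrative assumptions, not the paper's exact procedure.
    ```python
    import numpy as np
    from scipy.optimize import nnls

    def tsr_like_classify(D, labels, y, n_candidates=200):
        # D: (d, n) gallery matrix (one column per sample), labels: (n,) NumPy
        # array of identities, y: (d,) probe feature vector.
        corr = D.T @ y / (np.linalg.norm(D, axis=0) * np.linalg.norm(y) + 1e-12)
        keep = np.argsort(corr)[-n_candidates:]          # filtering strategy
        coef, _ = nnls(D[:, keep], y)                    # nonnegative, tends to be sparse
        best, best_err = None, np.inf
        for c in np.unique(labels[keep]):
            idx = labels[keep] == c
            err = np.linalg.norm(y - D[:, keep][:, idx] @ coef[idx])
            if err < best_err:                           # vote by reconstruction error
                best, best_err = c, err
        return best
    ```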
  • ABSTRACT: The active shape model (ASM), as a method for extracting and representing object shapes, has received considerable attention in recent years. In ASM, a shape is represented statistically by a set of well-defined landmark points, and its variations are modeled by principal component analysis (PCA). However, we find that both PCA and Procrustes analysis are sensitive to noise, and there is a linear relationship between the alignment error and the magnitude of the noise, which makes parameter estimation ill-posed. In this paper, we present a sparse ASM based on l1-minimization for shape alignment, which can automatically select an effective group of principal components to represent a given shape. A noise term is introduced into both the shape parameters and the pose parameters (scale, translation, and rotation), and parameter estimation is solved within the l1-minimization framework. The estimation of these two kinds of parameters is independent and robust to local noise. Experiments on a face dataset validate the robustness and effectiveness of the proposed technique.
    Proceedings of the 7th Chinese conference on Biometric Recognition; 12/2012
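    A generic l1 sketch of the noise-augmented shape model: the observed shape is explained as mean + P b + e with a sparse noise term e, estimated by proximal gradient (ISTA) on the stacked dictionary [P, I]. The pose parameters and the paper's exact formulation are not modeled here.
    ```python
    import numpy as np

    def soft(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def sparse_shape_params(mean_shape, P, observed, lam=0.5, n_iter=200):
        # Estimate shape parameters b and a sparse noise term e such that
        # observed ~ mean_shape + P b + e, with an l1 penalty on e only.
        A = np.hstack([P, np.eye(P.shape[0])])       # columns: modes + per-point noise
        y = observed - mean_shape
        z = np.zeros(A.shape[1])
        step = 1.0 / np.linalg.norm(A, 2) ** 2
        for _ in range(n_iter):
            grad = A.T @ (A @ z - y)
            z = z - step * grad
            # only the noise block is l1-penalized; shape modes stay unpenalized
            z[P.shape[1]:] = soft(z[P.shape[1]:], step * lam)
        return z[:P.shape[1]], z[P.shape[1]:]        # (b, e)
    ```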