A Pure L1-norm Principal Component Analysis

Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, VA 23284.
Computational Statistics & Data Analysis (Impact Factor: 1.4). 05/2013; 61:83-98. DOI: 10.1016/j.csda.2012.11.007
Source: PubMed


The L1 norm has been applied in numerous variations of principal component analysis (PCA). L1-norm PCA is an attractive alternative to traditional L2-based PCA because it can impart robustness in the presence of outliers and is indicated for models where standard Gaussian assumptions about the noise may not apply. Of all the previously proposed PCA schemes that recast PCA as an optimization problem involving the L1 norm, none provide globally optimal solutions in polynomial time. This paper proposes an L1-norm PCA procedure based on the efficient calculation of the optimal solution of the L1-norm best-fit hyperplane problem. We present a procedure, called L1-PCA*, that applies this idea to fit data to subspaces of successively smaller dimension. The procedure is implemented and tested on a diverse problem suite. Our tests show that L1-PCA* is the indicated procedure in the presence of unbalanced outlier contamination.
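The core subproblem, the L1-norm best-fit hyperplane, can be attacked with linear programming: the L1 distance from a point to a hyperplane is attained along a single coordinate direction, so one can solve a least-absolute-deviations (LAD) regression treating each coordinate in turn as the dependent variable and keep the best fit. The sketch below illustrates that idea; it is an assumption-laden illustration using SciPy, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(A, y):
    """Least-absolute-deviations regression y ~ A @ beta, posed as an LP.
    Variables: beta (free), e+ >= 0, e- >= 0, with A @ beta + e+ - e- = y;
    the objective minimizes sum(e+) + sum(e-) = sum |residuals|."""
    n, p = A.shape
    c = np.concatenate([np.zeros(p), np.ones(2 * n)])
    A_eq = np.hstack([A, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p], res.fun

def l1_hyperplane(X):
    """Illustrative L1 best-fit affine hyperplane for an n x m data matrix X:
    try each coordinate as the dependent variable, solve an LAD regression
    on the remaining coordinates, and keep the fit with smallest L1 error."""
    n, m = X.shape
    best = None
    for j in range(m):
        mask = np.arange(m) != j
        A = np.hstack([X[:, mask], np.ones((n, 1))])  # affine term
        beta, err = lad_fit(A, X[:, j])
        if best is None or err < best[0]:
            best = (err, j, beta)
    return best  # (total L1 error, dependent coordinate, coefficients)
```

For data lying exactly on a hyperplane the returned L1 error is (numerically) zero; LAD's resistance to a few grossly contaminated rows is what motivates the L1 criterion in the abstract above.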

  • "A line of recent research pursues calculation of L1 principal components under error minimization [3]-[9]. The error surface is non-smooth and the problem non-convex, resisting attempts at guaranteed optimization even at exponential computational cost."
    ABSTRACT: We describe ways to define and calculate L1-norm signal subspaces which are less sensitive to outlying data than L2-calculated subspaces. We start with the computation of the L1 maximum-projection principal component of a data matrix containing N signal samples of dimension D. We show that while the general problem is formally NP-hard in asymptotically large N, D, the case of engineering interest of fixed dimension D and asymptotically large sample size N is not. In particular, for the case where the sample size is less than the fixed dimension (N < D), we present in explicit form an optimal algorithm of computational cost 2^N. For the case N ≥ D, we present an optimal algorithm of complexity O(N^D). We generalize to multiple L1-max-projection components and present an explicit optimal L1 subspace calculation algorithm of complexity O(N^(DK−K+1)) where K is the desired number of L1 principal components (subspace rank). We conclude with illustrations of L1-subspace signal processing in the fields of data dimensionality reduction, direction-of-arrival estimation, and image conditioning/restoration.
    IEEE Transactions on Signal Processing 05/2014; 62(19). DOI:10.1109/TSP.2014.2338077 · 2.79 Impact Factor
  • ABSTRACT: This paper proposes several principal component analysis (PCA) methods based on Lp-norm optimization techniques. In doing so, the objective function is defined using the Lp-norm with an arbitrary p value, and the gradient of the objective function is computed on the basis of the fact that the number of training samples is finite. In the first part, the easier problem of extracting only one feature is dealt with. In this case, principal components are searched for either by a gradient ascent method or by a Lagrangian multiplier method. When more than one feature is needed, features can be extracted one by one greedily, based on the proposed method. Second, a more difficult problem is tackled that simultaneously extracts more than one feature. The proposed methods are shown to find a local optimal solution. In addition, they are easy to implement without significantly increasing computational complexity. Finally, the proposed methods are applied to several datasets with different values of p and their performances are compared with those of conventional PCA methods.
    06/2013; DOI:10.1109/TCYB.2013.2262936
  • ABSTRACT: Conventional subspace-based signal direction-of-arrival estimation methods rely on the familiar L2-norm-derived principal components (singular vectors) of the observed sensor-array data matrix. In this paper, for the first time in the literature, we find the L1-norm maximum-projection components of the observed data and search in their subspace for signal presence. We demonstrate that L1-subspace direction-of-arrival estimation exhibits (i) performance similar to L2 (usual singular-value/eigenvector decomposition) direction-of-arrival estimation under normal nominal-data system operation and (ii) significant resistance to sporadic/occasional directional jamming and/or faulty measurements.
    SPIE Sensing Technology+Applications, Compressive Sensing Conf., Baltimore, MD; 05/2014
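The exhaustive algorithms in the IEEE Transactions on Signal Processing abstract above rest on a known identity: the unit vector q maximizing the L1 projection ||X^T q||_1 of a D x N data matrix X is X c* / ||X c*||_2, where c* is the binary sign vector in {±1}^N maximizing ||X c||_2. A minimal sketch of that 2^N enumeration (an illustration of the identity, not the authors' code):

```python
import itertools
import numpy as np

def l1_pc_exhaustive(X):
    """Exhaustive L1 maximum-projection principal component of a D x N
    matrix X, via max_{||q||_2 = 1} ||X^T q||_1 = max_{c in {+-1}^N} ||X c||_2.
    Cost is O(2^N), so this is only practical for small sample sizes N."""
    D, N = X.shape
    best_norm, best_c = -1.0, None
    # Fixing c[0] = +1 halves the search: c and -c yield the same subspace.
    for tail in itertools.product([1, -1], repeat=N - 1):
        c = np.array((1,) + tail, dtype=float)
        nrm = np.linalg.norm(X @ c)
        if nrm > best_norm:
            best_norm, best_c = nrm, c
    q = (X @ best_c) / best_norm  # unit L1 principal component
    return q, best_norm          # best_norm equals ||X^T q||_1 at the optimum
```

At the maximizing c*, the signs of X^T X c* agree with c*, which is why the attained L1 projection value coincides with ||X c*||_2.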