A Pure L1-norm Principal Component Analysis

Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, VA 23284.
Computational Statistics & Data Analysis, 61:83–98, May 2013. DOI: 10.1016/j.csda.2012.11.007
Source: PubMed


The L1 norm has been applied in numerous variations of principal component analysis (PCA). L1-norm PCA is an attractive alternative to traditional L2-based PCA because it can impart robustness in the presence of outliers and is indicated for models where standard Gaussian assumptions about the noise may not apply. Of the previously proposed PCA schemes that recast PCA as an optimization problem involving the L1 norm, none provide globally optimal solutions in polynomial time. This paper proposes an L1-norm PCA procedure based on the efficient calculation of the optimal solution of the L1-norm best-fit hyperplane problem. We present a procedure, called L1-PCA*, that applies this idea to fit data to subspaces of successively smaller dimension. The procedure is implemented and tested on a diverse problem suite. Our tests show that L1-PCA* is the indicated procedure in the presence of unbalanced outlier contamination.
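
The building block of the proposed procedure is the L1-norm best-fit hyperplane, which can be computed with linear programming. As a rough illustration of that ingredient only (not the authors' L1-PCA* code), the sketch below fits a hyperplane to a point set by running a least-absolute-deviations (LAD) regression with each coordinate in turn as the response variable and keeping the lowest-error fit; the helper names and the use of SciPy's linprog are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(A, y):
    """Least-absolute-deviations fit of y ~ A @ beta, posed as a linear program.

    Decision variables are [beta, e_plus, e_minus]; we minimize sum(e_plus + e_minus)
    subject to A @ beta + e_plus - e_minus = y with e_plus, e_minus >= 0.
    """
    n, p = A.shape
    c = np.concatenate([np.zeros(p), np.ones(2 * n)])        # cost: total absolute residual
    A_eq = np.hstack([A, np.eye(n), -np.eye(n)])             # A beta + e+ - e- = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)      # beta free, residual parts >= 0
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p], res.fun                                # coefficients, total L1 error

def l1_best_fit_hyperplane(X):
    """Fit a hyperplane to the rows of X (n points in R^d): try each coordinate
    as the LAD 'response' and keep the fit with the smallest total L1 error."""
    n, d = X.shape
    best = None
    for j in range(d):
        others = np.arange(d) != j
        A = np.hstack([X[:, others], np.ones((n, 1))])       # remaining coordinates + intercept
        beta, err = lad_fit(A, X[:, j])
        if best is None or err < best[2]:
            best = (j, beta, err)
    return best
```

In an L1-PCA*-style scheme, such a hyperplane fit would be applied repeatedly, projecting the data onto the fitted subspace and reducing the dimension by one at each step.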

  • Source
    • "One can apply standard non-linear optimization schemes to (1.4), e.g., sequential rank-one updates [19], alternating optimization [20] (a.k.a. coordinate descent), the Wiberg algorithm [10], augmented Lagrangian approaches [36], successive projections on hyperplanes and linear programming [5], to cite a few. The main drawback of this class of methods is that it does not guarantee to recover the global optimum of (1.4) and is in general sensitive to initialization. "
    ABSTRACT: The low-rank matrix approximation problem with respect to the component-wise $\ell_1$-norm ($\ell_1$-LRA), which is closely related to robust principal component analysis (PCA), has become a very popular tool in data mining and machine learning. Robust PCA aims at recovering a low-rank matrix that was perturbed with sparse noise, with applications for example in foreground-background video separation. Although $\ell_1$-LRA is strongly believed to be NP-hard, there is, to the best of our knowledge, no formal proof of this fact. In this paper, we prove that $\ell_1$-LRA is NP-hard, already in the rank-one case, using a reduction from MAX CUT. Our derivations draw interesting connections between $\ell_1$-LRA and several other well-known problems, namely, robust PCA, $\ell_0$-LRA, binary matrix factorization, a particular densest bipartite subgraph problem, the computation of the cut norm of $\{-1,+1\}$ matrices, and the discrete basis problem, which we all prove to be NP-hard.
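
The quoted passage above lists alternating optimization among the heuristics commonly applied to this problem. For illustration only, the sketch below alternates weighted-median updates for the rank-one case, minimizing the entry-wise L1 error of M - u v^T over u and v; the initialization, iteration count, and function names are assumptions, and, consistent with the NP-hardness result above, such a scheme generally reaches only a local solution and is sensitive to the starting point.

```python
import numpy as np

def weighted_median(values, weights):
    """Return a minimizer of sum_i weights[i] * |values[i] - t| over t."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    idx = np.searchsorted(cum, 0.5 * w.sum())   # first index reaching half the total weight
    return v[min(idx, len(v) - 1)]

def rank_one_l1_approx(M, n_iter=50, seed=0):
    """Heuristic for min_{u,v} ||M - u v^T||_1 by alternating weighted-median updates."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    u = rng.standard_normal(m)
    v = np.zeros(n)
    for _ in range(n_iter):
        # Update each v_j: minimize sum_i |M[i, j] - u[i] * v_j| (a weighted median).
        nz = np.abs(u) > 1e-12
        if not nz.any():
            break
        for j in range(n):
            v[j] = weighted_median(M[nz, j] / u[nz], np.abs(u[nz]))
        # Update each u_i symmetrically with v fixed.
        nz = np.abs(v) > 1e-12
        if not nz.any():
            break
        for i in range(m):
            u[i] = weighted_median(M[i, nz] / v[nz], np.abs(v[nz]))
    return u, v
```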
  • Source
    • "A line of recent research pursues calculation of L 1 principal components under error minimization [3]-[9]. The error surface is non-smooth and the problem non-convex resisting attempts to guaranteed optimization even with exponential computational cost. "
    ABSTRACT: We describe ways to define and calculate L1-norm signal subspaces which are less sensitive to outlying data than L2-calculated subspaces. We start with the computation of the L1 maximum-projection principal component of a data matrix containing N signal samples of dimension D. We show that while the general problem is formally NP-hard in asymptotically large N, D, the case of engineering interest of fixed dimension D and asymptotically large sample size N is not. In particular, for the case where the sample size is less than the fixed dimension (N < D), we present in explicit form an optimal algorithm of computational cost 2^N. For the case N ≥ D, we present an optimal algorithm of complexity O(N^D). We generalize to multiple L1-max-projection components and present an explicit optimal L1 subspace calculation algorithm of complexity O(N^(DK−K+1)) where K is the desired number of L1 principal components (subspace rank). We conclude with illustrations of L1-subspace signal processing in the fields of data dimensionality reduction, direction-of-arrival estimation, and image conditioning/restoration.
    IEEE Transactions on Signal Processing, 62(19), 05/2014. DOI: 10.1109/TSP.2014.2338077
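
For intuition about the search described in the abstract above, here is a brute-force sketch that recovers a single maximum-projection L1 component by enumerating sign vectors, using the identity max over unit-norm w of sum_n |x_n^T w| = max over b in {-1,+1}^N of ||X b||_2. It costs 2^N evaluations, so it is only usable for very small N and is not the polynomial-time O(N^D) algorithm referenced above; the function name and data layout are assumptions.

```python
import itertools
import numpy as np

def l1_pc_exhaustive(X):
    """Exhaustive search for the maximum-projection L1 principal component.

    X is D x N (columns are samples). Enumerates all 2^N sign vectors b,
    keeps the one maximizing ||X b||_2, and returns the unit vector
    w = X b / ||X b||_2, which maximizes sum_n |x_n^T w|.
    """
    D, N = X.shape
    best_norm, best_b = -1.0, None
    for signs in itertools.product([-1.0, 1.0], repeat=N):
        b = np.array(signs)
        norm = np.linalg.norm(X @ b)
        if norm > best_norm:
            best_norm, best_b = norm, b
    w = X @ best_b
    return w / np.linalg.norm(w)    # unit-norm L1 principal component
```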
  • ABSTRACT: This paper proposes several principal component analysis (PCA) methods based on Lp-norm optimization techniques. The objective function is defined using the Lp norm with an arbitrary value of p, and its gradient is computed by exploiting the fact that the number of training samples is finite. The first part addresses the easier problem of extracting a single feature; principal components are found either by a gradient ascent method or by a Lagrangian multiplier method, and when more than one feature is needed, features can be extracted one by one in a greedy fashion. The second part tackles the harder problem of extracting more than one feature simultaneously. The proposed methods are shown to find locally optimal solutions and are easy to implement without significantly increasing computational complexity. Finally, the proposed methods are applied to several datasets with different values of p, and their performance is compared with that of conventional PCA methods.
    06/2013; DOI:10.1109/TCYB.2013.2262936
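
As a concrete instance of the projection-maximization idea, the sketch below implements the p = 1 special case as a simple fixed-point iteration on a unit vector w; for general p the update would instead follow the gradient of sum_i |w^T x_i|^p, which this sketch does not attempt. The stopping rule, names, and data layout are assumptions, and, like the methods described in the abstract, it converges only to a local optimum.

```python
import numpy as np

def l1_pca_fixed_point(X, n_iter=100, tol=1e-8, seed=0):
    """Locally maximize sum_i |w^T x_i| over unit vectors w.

    X is N x D (rows are samples). Each step sets
        w <- sum_i sign(w^T x_i) x_i, then renormalizes,
    which does not decrease the objective; the result depends on the start point.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = np.sign(X @ w)
        s[s == 0] = 1.0               # break ties away from zero
        w_new = X.T @ s
        nrm = np.linalg.norm(w_new)
        if nrm == 0.0:                # degenerate case: keep current direction
            break
        w_new /= nrm
        if np.linalg.norm(w_new - w) < tol:
            w = w_new
            break
        w = w_new
    return w
```

Additional components could then be extracted one by one by projecting the data onto the orthogonal complement of w and repeating, in the greedy spirit described above.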