Conference Paper

Proximal Methods for Sparse Hierarchical Dictionary Learning.

Conference: Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel
Source: DBLP


We propose to combine two approaches for modeling data admitting sparse representations: on the one hand, dictionary learning has proven effective for various signal processing tasks. On the other hand, recent work on structured sparsity provides a natural framework for modeling dependencies between dictionary elements. We thus consider a tree-structured sparse regularization to learn dictionaries embedded in a hierarchy. The involved proximal operator is computable exactly via a primal-dual method, allowing the use of accelerated gradient techniques. Experiments show that for natural image patches, learned dictionary elements organize themselves in such a hierarchical structure, leading to an improved performance for restoration tasks. When applied to text documents, our method learns hierarchies of topics, thus providing a competitive alternative to probabilistic topic models.
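
As an illustration of the proximal operator mentioned in the abstract, here is a minimal sketch, not the authors' implementation. It assumes the penalty is a sum of weighted ℓ2 norms, one per tree node, each taken over the node and all of its descendants. Under that assumption the operator can be computed in one pass by applying group soft-thresholding from the leaves up to the root. The names `tree_prox`, `children`, `lam` and `weights` are hypothetical and chosen only for this example.

```python
import math

def group_soft_threshold(x, group, threshold):
    """Shrink the entries of x indexed by `group` toward zero, in place."""
    norm = math.sqrt(sum(x[i] ** 2 for i in group))
    # The whole group is zeroed when its l2 norm does not exceed the threshold.
    scale = 0.0 if norm <= threshold else 1.0 - threshold / norm
    for i in group:
        x[i] *= scale

def descendants(children, node):
    """Return `node` followed by all of its descendants in the tree."""
    out = [node]
    for child in children.get(node, []):
        out.extend(descendants(children, child))
    return out

def tree_prox(v, children, root, lam, weights=None):
    """Proximal operator of lam * sum_g w_g * ||x_g||_2 (assumed penalty form).

    `children` maps a node index to the list of its child indices; the group
    attached to a node is the node plus all of its descendants.
    """
    x = list(v)

    def visit(node):
        # Post-order traversal: every child group is processed before its parent.
        for child in children.get(node, []):
            visit(child)
        w = 1.0 if weights is None else weights[node]
        group_soft_threshold(x, descendants(children, node), lam * w)

    visit(root)
    return x

# Toy tree: node 0 is the root, nodes 1 and 2 are its children.
x = tree_prox([3.0, 0.5, 4.0], {0: [1, 2]}, root=0, lam=1.0)
```

In this toy call the weak coefficient at node 1 is set to zero while nodes 0 and 2 are only shrunk. When the root group itself is thresholded to zero, every descendant is zeroed with it, which is the hierarchical sparsity pattern the abstract describes.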



Available from: Francis Bach
    • "To avoid the non-convex optimization problem incurred by ℓ0-norm, most of the sparse graph based methods [19] [4] [8] [9] [21] [22] replaces ℓ0-norm with ℓ1-norm so as to solve a convex optimization problem. In addition, ℓ1-norm has been widely used as a convex relaxation of ℓ0-norm for efficient sparse coding algorithms [12] [13] [14]. [9] points out that in case that the data are drawn from linear independent subspaces, sparse representation by ℓ1-norm can recover the underlying subspaces."
    ABSTRACT: $\ell^{1}$-graph \cite{YanW09,ChengYYFH10}, a sparse graph built by reconstructing each datum with all the other data using sparse representation, has been demonstrated to be effective in clustering high dimensional data and recovering independent subspaces from which the data are drawn. It is well known that $\ell^{1}$-norm used in $\ell^{1}$-graph is a convex relaxation of $\ell^{0}$-norm for enforcing the sparsity. In order to handle general cases when the subspaces are not independent and follow the original principle of sparse representation, we propose a novel $\ell^{0}$-graph that employs $\ell^{0}$-norm to encourage the sparsity of the constructed graph, and develop a proximal method to solve the associated optimization problem with the proved guarantee of convergence. Extensive experimental results on various data sets demonstrate the superiority of $\ell^{0}$-graph compared to other competing clustering methods including $\ell^{1}$-graph.
    • "Recent years, multi-scale dictionary has received much attention due to its better depiction of the data geometry [3] compared with its single-scale counterpart [2]. Multi-scale representation achieves stronger signal restoration ability [4], and has been effectively applied to many machine learning tasks, including classification [5], novelty detection [6] and topic modelling [7]. We introduce multi-scale dictionary based representation into solving the Poisson compressive sensing (CS) inverse problem. "
    ABSTRACT: A novel multi-scale dictionary based Bayesian reconstruction algorithm is proposed for compressive X-ray imaging, which encodes the material's spectrum by Poisson measurements. Inspired by recently developed compressive X-ray imaging systems [1], this work aims to recover the material's spectrum from the compressive coded image by leveraging a reference spectrum library. Instead of directly using the huge and redundant library as a dictionary, which is cumbersome in computation and difficult for selecting those active dictionary atoms, a multi-scale tree structured dictionary is refined from the spectrum library, and following this a Bayesian reconstruction algorithm is developed. Experimental results on real data demonstrate superior performance in comparison with traditional methods.
    ICASSP 2015; 04/2015
    • "As an extension of lasso [31], group lasso [5] incorporates the correlation of features inside inner groups and successfully induces the group effect. Furthermore, many other structured sparsity-like methods have been proposed, such as fused lasso [32], elastic net [45], Graph-guide sparsity [10], and tree-structure sparsity [12]. Recently, Mairal et al. proposed path coding penalties [22] to take into account the correlations among features that are embedded in a long-range path on a graph. "
    ABSTRACT: The selection of a subset of discriminative features for semantic recognition is crucial to making multimedia analysis more interpretable. This paper proposes a model of spatial path coding (SPC) that uses a supervised technique to select sparse features. SPC is a regularized penalty that encodes the spatial correlations of features obtained by the spatial pyramid model. In SPC, each feature dimension is considered as a vertex in a direct acyclic graph (DAG), and the spatial correlations among features are considered as directed edges associated with predefined weights. Thus, the process of supervised feature selection can be directly formulated to solve a path selection problem with minimum cost. Experiments are conducted to evaluate the performance of supervised feature selection with SPC for the tasks of scene classification and action recognition using four benchmark datasets. The results show that SPC can be used to automatically select a subgraph of the DAG with a small number of discriminative features for a certain category. In addition, the method proposed in this paper shows better performance in terms of classification and recognition accuracy as compared with state-of-the-art algorithms.
    Information Sciences 10/2014; 281:523–535. DOI:10.1016/j.ins.2014.03.093 · 4.04 Impact Factor
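
The first citing excerpt above contrasts the ℓ0 penalty with its convex ℓ1 relaxation. The sketch below shows the two corresponding elementwise proximal operators, soft-thresholding for ℓ1 and hard-thresholding for ℓ0. It is only an illustration of that contrast, not the method of any of the cited papers; the function names and the convention of minimizing 0.5 * (x - v)^2 + lam * penalty(x) are assumptions made for this example.

```python
import math

def prox_l1(v, lam):
    """Soft-thresholding: prox of lam * ||x||_1, applied entry by entry."""
    def soft(t):
        if t > lam:
            return t - lam
        if t < -lam:
            return t + lam
        return 0.0
    return [soft(t) for t in v]

def prox_l0(v, lam):
    """Hard-thresholding: prox of lam * ||x||_0, applied entry by entry.

    Keeping an entry costs lam, zeroing it costs 0.5 * t**2, so an entry
    survives only when |t| > sqrt(2 * lam).
    """
    cutoff = math.sqrt(2.0 * lam)
    return [t if abs(t) > cutoff else 0.0 for t in v]

v = [3.0, -0.5, 1.5]
shrunk = prox_l1(v, 1.0)   # surviving entries are moved toward zero by lam
kept = prox_l0(v, 1.0)     # surviving entries are left unchanged
```

Both operators zero out small entries, but the ℓ1 version also biases the surviving entries toward zero, whereas the ℓ0 version keeps them at their original values at the price of a non-convex problem.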