Exact Biobjective k-Sparse Nonnegative Least Squares
Nicolas Nadisic, Arnaud Vandaele, Nicolas Gillis
Université de Mons
Mons, Belgium
{firstname.lastname}@umons.ac.be
Jeremy E. Cohen
Université de Rennes, INRIA, CNRS, IRISA
Rennes, France
jeremy.cohen@irisa.fr
Abstract—The k-sparse nonnegative least squares (NNLS)
problem is a variant of the standard least squares problem, where
the solution is constrained to be nonnegative and to have at most k nonzero entries. Several methods exist to tackle this NP-hard
problem, including fast but approximate heuristics, and exact
methods based on brute-force or branch-and-bound algorithms.
Although intuitive, the k-sparse constraint is sometimes limited;
the parameter k can be hard to tune, especially in the case of NNLS with multiple right-hand sides (MNNLS), where the relevant k could differ between columns. In this work, we propose
a novel biobjective formulation of the k-sparse nonnegative least
squares problem. We present an extension of Arborescent, a
branch-and-bound algorithm for exact k-sparse NNLS, that com-
putes the whole Pareto front (that is, the set of optimal solutions
for all values of k) instead of only the k-sparse solution, for
virtually the same computing cost. We also present a method for
MNNLS that enforces a matrix-wise sparsity constraint, by first
computing the Pareto front for each column and then selecting
one solution per column to build a globally optimal solution
matrix. We show the advantages of the proposed approach for
the unmixing of hyperspectral images.
Index Terms—sparse approximation, ℓ0 constraint, biobjective optimization, nonnegative least squares

NN and NG acknowledge the support by the European Research Council (ERC starting grant No 679515), and by the Fonds de la Recherche Scientifique - FNRS and the Fonds Wetenschappelijk Onderzoek - Vlaanderen (FWO) under EOS project O005318F-RG47.
I. INTRODUCTION
Nonnegative least squares (NNLS) problems occur in many
signal processing and data mining tasks, when data points
can be expressed as additive linear combinations of basis
components [1]. For example, in hyperspectral images, the
spectral signature of a pixel is the additive linear combination
of the spectral signatures of the materials it contains [2]. NNLS
problems are also at the heart of many alternating algorithms
to compute nonnegative matrix factorization (NMF) [3].
Given A ∈ ℝ^{m×r} and b ∈ ℝ^m, the standard NNLS problem can be written as follows,

    \min_{x} \|Ax - b\|_2^2 \quad \text{such that} \quad x \ge 0, \qquad (1)

where x ∈ ℝ^r, and x ≥ 0 means x is entry-wise nonnegative.
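As a concrete illustration, the following sketch solves (1) with a simple projected-gradient scheme, written in Julia (the language of the implementation used later in Section VI). The function name nnls_pg, its warm-start keyword x0, and the fixed iteration budget are our own illustrative choices; the paper itself relies on an active-set solver (see Section III).

    using LinearAlgebra

    # Illustrative projected-gradient solver for min_{x >= 0} ||Ax - b||_2^2.
    # x0 allows warm-starting from a previous solution (useful later, in Section III).
    function nnls_pg(A::AbstractMatrix, b::AbstractVector;
                     x0 = zeros(size(A, 2)), iters::Int = 5_000)
        x = copy(x0)
        L = opnorm(A)^2                  # largest eigenvalue of A'A, sets the step size
        for _ in 1:iters
            g = A' * (A * x - b)         # (half of the) gradient of ||Ax - b||_2^2
            x = max.(x .- g ./ L, 0.0)   # gradient step followed by projection onto x >= 0
        end
        return x
    end

    A = [1.0 0.2; 0.3 1.0; 0.5 0.5]
    b = [1.0, 0.4, 0.7]
    x = nnls_pg(A, b)                    # entry-wise nonnegative solution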
Nonnegativity in least squares problems is known to natu-
rally induce sparsity (see Theorem 6.1 in [4]), that is, solutions
with few nonzero entries. Sparsity is an appreciated feature,
as it often improves the interpretability of the solution even for invertible linear systems. For example, in hyperspectral
unmixing, that is, the task of identifying materials in a
hyperspectral image, sparsity means a pixel is expressed as
a combination of only a few materials. However, there is
no guarantee on the sparsity of the solution of a general
NNLS problem, while some applications may benefit from
explicit sparsity constraints. Leveraging prior knowledge of
the sparsity of the solution can help regularize the problem,
reduce noise, and improve the results.
The most natural sparsity measure is the ℓ0-“norm”, as it is equal to the number of nonzero entries of a vector, ‖x‖₀ = Card{i : xᵢ ≠ 0}. A vector is said to be k-sparse if it has at most k nonzero entries. A common way to enforce sparsity is with a k-sparsity constraint. Combined with nonnegativity, this leads to the following problem, called k-sparse NNLS,

    \min_{x} \|Ax - b\|_2^2 \quad \text{such that} \quad x \ge 0 \ \text{and} \ \|x\|_0 \le k. \qquad (2)
This problem is sometimes called nonnegative sparse coding
or cardinality-constrained NNLS.
In hyperspectral unmixing, the k-sparsity constraint means
a pixel can be composed of at most k materials. Albeit
quite intuitive, this formulation still suffers from the need
to choose an appropriate parameter k. Most importantly, in
NNLS problems with multiple right-hand sides (MNNLS), the
suitable k often varies from one column to another (pixels can
contain different numbers of materials), and imposing a single
sparsity parameter can produce inadequate results.
To overcome this issue, instead of optimizing the error while
constraining the sparsity, one can consider a biobjective formu-
lation. Here, the objectives are minimizing the reconstruction
error on the one hand, and maximizing the sparsity (that is,
minimizing the ℓ0-“norm”) on the other hand,

    \min_{x \ge 0} \ \left\{ \|Ax - b\|_2^2, \ \|x\|_0 \right\}. \qquad (3)
These objectives are conflicting, hence there is no unique
optimal solution to Problem (3), and solutions representing
different tradeoffs between error and sparsity are all equally
good. Therefore, we consider the notion of Pareto-optimality.
Given a set of objectives to optimize, a solution x is said to be Pareto-optimal if and only if it is not dominated, that is, there does not exist a solution that is at least as good as x on all objectives and strictly better than x on at least one objective. The set of all Pareto-optimal solutions to a problem is called the Pareto front, see Figure 1.
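In code, the corresponding domination test is a simple predicate; the short sketch below (with our own naming) compares two candidate solutions of (3) by their (error, sparsity) pairs.

    using LinearAlgebra

    # (error, sparsity) pair of a candidate x for Problem (3)
    objectives(A, b, x) = (norm(A * x - b)^2, count(!iszero, x))

    # (e1, s1) dominates (e2, s2) if it is at least as good on both objectives
    # and strictly better on at least one of them
    dominates((e1, s1), (e2, s2)) = e1 <= e2 && s1 <= s2 && (e1 < e2 || s1 < s2)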
[Figure 1: plot of the error ‖Ax − b‖₂² (vertical axis) versus the sparsity ‖x‖₀ (horizontal axis, from 0 to r = 5); the point ‖x‖₀ = 0 has error ‖b‖₂² (x = 0), and ‖x‖₀ = r corresponds to x ∈ argmin_{x ≥ 0} ‖Ax − b‖₂².]
Fig. 1. Example of the Pareto front for a biobjective k-sparse NNLS problem with r = 5 variables. The first solution, for ‖x‖₀ = 0, corresponds to the zero vector. The last solution, for ‖x‖₀ = 5, corresponds to the NNLS problem with no sparsity constraint. Here the penultimate solution is identical to the last one, meaning that the solution with no sparsity constraint naturally has one zero entry.
In Problem (3), the discreteness of the ℓ0-“norm” actually makes the computation of the Pareto front easier, as it suffices to solve Problem (2) for all possible values of ‖x‖₀ in {1, 2, ..., r}. By computing the Pareto front instead of just one solution, we provide the user with a set of solutions to choose from, representing different tradeoffs between error and sparsity, and remove the need to define a parameter k a priori. Note that, when rank(A) < r, which includes the underdetermined case m < r, the optimal solution of Problem (2) is not necessarily unique. This does not change the principles of the methods presented in this work; they return one optimal solution among the possible ones.
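For small r, this observation translates directly into a brute-force computation of the Pareto front: enumerate every support, fit the restricted NNLS problem, and keep the best error per sparsity level. The sketch below does exactly that; it reuses the illustrative nnls_pg solver from the introduction and is meant for tiny instances only, not as a substitute for the branch-and-bound method of Sections III and IV.

    using LinearAlgebra

    # Brute-force Pareto front of Problem (3): front[k+1] is the best squared error
    # achievable with at most k nonzero entries (assumes nnls_pg defined above).
    function pareto_front_bruteforce(A, b)
        r = size(A, 2)
        front = fill(norm(b)^2, r + 1)          # k = 0 corresponds to x = 0
        for mask in 1:(2^r - 1)                 # every nonempty support, as a bitmask
            K = [i for i in 1:r if (mask >> (i - 1)) & 1 == 1]
            xK = nnls_pg(A[:, K], b)
            err = norm(A[:, K] * xK - b)^2
            front[length(K) + 1] = min(front[length(K) + 1], err)
        end
        for k in 2:r + 1                        # the front is nonincreasing in k
            front[k] = min(front[k], front[k - 1])
        end
        return front
    end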
In this paper, we tackle Problem (3) exactly. In section II, we briefly review existing approaches for sparse NNLS. In sec-
tion III, we describe the existing branch-and-bound algorithm
Arborescent (abbreviated Arbo), upon which our approach is
built. In section IV, we introduce our novel extension of Arbo
to tackle Problem (3) exactly. In section V, we present our
approach to leverage this extension in the context of sparse
MNNLS. In section VI, we illustrate the proposed method
with the unmixing of hyperspectral images.
II. RELATED WORK
The discreteness of the ℓ0-“norm” makes problems like (2)
combinatorial, and thus hard to solve. For this reason, many
approximate methods have been used.
The most common one is to use the ℓ1-norm as a convex relaxation of the ℓ0-“norm” in order to leverage the efficient algorithms and strong theoretical results from convex optimization. The ℓ1-penalized problem min_x ‖Ax − b‖₂² + λ‖x‖₁ is called the LASSO, and several nonnegative variants have been studied, see for example [5]. However, these methods suffer from several drawbacks. Tuning the parameter λ to reach a target sparsity can be tricky, especially in MNNLS where the adequate λ can vary between columns. Although there exist conditions under which ℓ1 methods are guaranteed to produce a solution with the same support as the ℓ0 method, they are quite restrictive in practice [6].
Greedy heuristics are also widely used. These methods start
with an empty support, and select entries one by one to add to
the support, until the target sparsity k is reached. The selection
is done greedily, by choosing at each iteration the entry that
maximizes the decrease of the error. Orthogonal variants make
sure that entries are not selected more than once; orthogonal
matching pursuit (OMP) and orthogonal least squares (OLS)
are the most popular algorithms. Recently, nonnegative vari-
ants have been studied, see [7] and the references therein. They
solve (2) approximately, and recovery guarantees depend on
conditions that can be restrictive in practice.
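For reference, the sketch below shows the general structure of such a forward greedy heuristic: at each iteration it tentatively adds every remaining entry to the support, refits the restricted NNLS problem (again using the illustrative nnls_pg solver from the introduction), and keeps the entry that decreases the error the most. It is a simplified illustration in the spirit of the nonnegative greedy algorithms of [7], not a faithful implementation of NNOMP or NNOLS.

    using LinearAlgebra

    # Forward greedy selection of a k-sparse support: at every step, add the entry
    # whose inclusion yields the largest error decrease (simplified sketch, not [7]).
    function greedy_support(A, b, k)
        r = size(A, 2)
        support = Int[]
        for _ in 1:k
            candidates = setdiff(1:r, support)
            errors = map(candidates) do j
                K = vcat(support, j)
                xK = nnls_pg(A[:, K], b)
                norm(A[:, K] * xK - b)^2     # error if entry j is added to the support
            end
            push!(support, candidates[argmin(errors)])
        end
        return support                       # at most k selected entries
    end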
A few methods were proposed to exactly solve ℓ0-constrained problems, similar to (2) but with different constraints. Reference [8] introduced a branch-and-cut algorithm using continuous relaxations of the ℓ0-“norm”. It was later extended and improved, see [9] and the references therein. Reference [9] introduced mixed-integer programming (MIP) formulations for several variants involving the ℓ0-“norm” (to
be able to solve them with a generic MIP solver), and [10]
proposed dedicated branch-and-bound algorithms to solve
them. Finally, [11] introduced a branch-and-bound algorithm
specifically designed for k-sparse NNLS; this is the foundation
our work is based upon.
III. THE ARBORESCENT ALGORITHM
In this section, we briefly describe the algorithm Arbo
introduced in [11]. This algorithm solves k-sparse NNLS
(2) exactly using a branch-and-bound strategy. Instead of
enumerating all possible supports, it uses the structure of the
problem to prune large parts of the search space. This search
space is mapped onto a tree, see Figure 2. Every node of the tree represents an over-support K of x, that is, the set of entries of x not constrained to be zero, with K ⊆ {1, 2, ..., r}. Exploring a node means solving the NNLS subproblem

    f^*(K) = \min_{x(K) \ge 0} \|A(:, K)\, x(K) - b\|_2^2,

where x(K) is the subvector composed of the entries of x indexed by K. The value f^*(K) is the error associated with the node corresponding to K. We solve the NNLS subproblems
using an active-set method [12]. On top of solving the sub-
problems exactly, this method supports a warm start, that is,
it can be initialized at a given node with the solution from a
previous node. This significantly speeds up the computation as
the initial guess at each node is close to the optimal solution.
[Figure 2: search tree with root X = [x1 x2 x3 x4 x5] (unconstrained, k′ ≤ r = 5); each child constrains one more entry to zero, e.g. X = [0 x2 x3 x4 x5] (k′ ≤ 4), X = [0 0 x3 x4 x5] (k′ ≤ 3), down to leaves such as X = [0 0 0 x4 x5] with k′ ≤ 2 = k, where the exploration stops.]
Fig. 2. Example of the Arbo search tree, for r = 5 and k = 2.
The root node represents the NNLS problem with no
sparsity constraint, and every descending node represents this
problem with one entry constrained to be zero. This is done
recursively, until reaching the nodes with k unconstrained entries. The nodes at this depth are leaves of the search tree, and
represent feasible solutions to problem (2). To prune this tree,
we use the fact that in any optimization problem, when adding
constraints, the solution cannot improve. By construction, a
given node will always have an error greater than (or equal
to) the error of its parent node. When we reach a leaf, we
obtain a feasible solution whose error is an upper bound for
problem (2). Therefore, if a given node N has an error greater than this bound, then all children nodes descending from N will also have an error greater than the bound, and thus cannot be optimal solutions; N can be pruned safely. Moreover, by
ordering the entries in the root node in ascending order and then exploring depth-first and “left-first”, we first constrain to zero the entries that are already close to zero in the standard NNLS problem, and that are therefore more likely to be zero in the constrained problem. This strategy quickly leads to good feasible solutions and allows large parts of the search space to be pruned efficiently. Other technical choices that are key to the
performance of the algorithm are detailed in [11].
IV. THE BIOBJECTIVE EXTENSION
Although Arbo was designed to solve the k-sparse NNLS
problem, it can be easily extended to compute the whole Pareto
front. Indeed, while exploring the search tree and computing
intermediate nodes, it also automatically computes the optimal k′-sparse solutions for all k′ ∈ {k, ..., r}. Our extension (which we call Arbo-Pareto) therefore consists in maintaining a list of the optimal k′-sparse solutions, performing a comparison at every node explored, and updating the list when a better solution is found. This has almost no effect on the computational cost of the algorithm, as the cost of a comparison and update is negligible compared to the cost of exploring a node.
To show that Arbo indeed computes these solutions, we
prove by contradiction that it cannot prune an optimal k′-sparse solution. Suppose that the node γ is the optimal k′-sparse solution for a given k′ > k, and that it is pruned by Arbo. Because it is pruned, its error must be larger than the error of some feasible k-sparse solution α, that is, f*(γ) > f*(α). However, there necessarily exists an ancestor node of α that is k′-sparse; we call it β. By construction, f*(α) ≥ f*(β), so we have f*(γ) > f*(β), meaning that γ is not the optimal k′-sparse solution. This contradicts the hypothesis.
Arbo-Pareto is detailed in Algorithm 1, where NNLS refers to the active-set method described above.

Algorithm 1: Arbo-Pareto
Input: A ∈ ℝ₊^{m×r}, b ∈ ℝ₊^m, k ∈ {1, 2, ..., r}
Output: Pareto front S
1:  Init K0 ← {1, ..., r}
2:  Init x0 ← NNLS(A, b)
3:  Sort the entries of x0 in ascending order
4:  Init P ← {(K0, x0)}
5:  Init E_i ← +∞ for all i ∈ {k, ..., r}
6:  Init S_i ← 0 (the zero vector) for all i ∈ {k, ..., r}
7:  while P ≠ ∅ do
8:      (K, x_parent) ← P.select()
9:      P ← P \ {(K, x_parent)}
10:     x, error ← NNLS(A(:, K), b, x_parent(K))
11:     k′ ← size(K)
12:     if error > E_k then
13:         prune (do nothing)
14:     else
15:         if k′ > k then
16:             foreach i ∈ K do
17:                 P ← P ∪ {(K \ {i}, x)}
18:         if error < E_{k′} then
19:             E_{k′} ← error
20:             S_{k′} ← x

The set P is the pool of nodes, initialized with the root node (with no entry constrained) on line 4. A node is selected from P on line 8, and removed from P on line 9. On line 10, the NNLS subproblem restricted to the over-support K is solved using the parent solution as initialization. If the error at the current node is worse than the current best feasible solution, then no descending node can be optimal, and we prune the current node (line 13). Otherwise, we continue the exploration. If the sparsity target k is not reached, we generate one node for every entry of the over-support (lines 16 and 17). We then compare the error of the current node with the error of the current best k′-sparse solution, on line 18. If it is lower, we update this error (line 19) and the Pareto front (line 20).
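To make the structure of Algorithm 1 concrete, here is a compact and deliberately simplified Julia sketch of Arbo-Pareto. It is not the authors' implementation: it reuses the illustrative nnls_pg solver from the introduction (warm-started with the parent solution) instead of the active-set method of [12], explores the pool with a plain stack, and replaces the entry ordering of [11] by a simple set of visited supports.

    using LinearAlgebra

    # Simplified sketch of Arbo-Pareto (Algorithm 1). Returns, for every sparsity
    # level k' in {k, ..., r}, the best error found and the corresponding solution.
    function arbo_pareto(A, b, k)
        r = size(A, 2)
        E = fill(Inf, r)                               # E[k'] = best error with at most k' nonzeros
        S = [zeros(r) for _ in 1:r]                    # S[k'] = corresponding solution
        x0 = nnls_pg(A, b)
        pool = [(collect(1:r), x0)]                    # nodes: (over-support K, parent solution)
        seen = Set{Vector{Int}}()                      # avoid exploring the same support twice
        while !isempty(pool)
            K, xparent = pop!(pool)                    # depth-first exploration
            K in seen && continue
            push!(seen, K)
            xK = nnls_pg(A[:, K], b; x0 = xparent[K])  # warm-started restricted NNLS (line 10)
            err = norm(A[:, K] * xK - b)^2
            err > E[k] && continue                     # prune: cannot beat the k-sparse bound (line 12)
            kp = length(K)                             # k' = number of unconstrained entries
            x = zeros(r); x[K] = xK                    # lift the restricted solution back to length r
            if kp > k                                  # branch: constrain one more entry to zero
                for i in K
                    push!(pool, (setdiff(K, [i]), x))
                end
            end
            if err < E[kp]                             # update the Pareto front at level k' (lines 18-20)
                E[kp] = err
                S[kp] = x
            end
        end
        return E, S
    end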
Note that, if k = 1, Arbo computes the whole Pareto front. Otherwise, it only computes the part with k′ ∈ {k, ..., r}. The
extended algorithm Arbo-Pareto can be used to solve sparse
NNLS problems in a biobjective way, but it can also be used
as a subroutine in an MNNLS algorithm with a matrix-wise
sparsity constraint, as described in the next section.
V. MATRIX-WISE SPARSITY CONSTRAINT IN MNNLS
In MNNLS problems, which occur for example in alternating algorithms for NMF, sparsity is usually enforced column-wise, as follows,

    \min_{X \ge 0} \|AX - B\|_F^2 \quad \text{such that} \quad \forall j, \ \|X(:, j)\|_0 \le k, \qquad (4)

where B ∈ ℝ^{m×n} is a given data matrix, A ∈ ℝ^{m×r} is a given dictionary, X ∈ ℝ₊^{r×n} is the solution matrix we compute, and X(:, j) denotes the jth column of X. Problem (4) can be decomposed into n independent k-sparse NNLS problems of the form (2).
Although this formulation is intuitive (a data point is composed of at most k basis components), it can be limiting in some contexts. Notably, when sparsity varies between columns, setting the right k can be tricky. This is often the case in hyperspectral images, where pixels contain different numbers of materials. To overcome this issue, we consider a matrix-wise sparsity constraint,

    \min_{X \ge 0} \|AX - B\|_F^2 \quad \text{such that} \quad \|X\|_0 \le q, \qquad (5)

where q is a matrix-wise sparsity parameter, thus enforcing an average sparsity of q/n on the columns of X.
Theoretically, we could solve problem (5) with any al-
gorithm for k-sparse NNLS (such as greedy algorithms, or
even Arbo), by vectorizing the problem. However, this would not be computationally tractable for large instances, as the resulting NNLS problem would have dimensions mn × rn.
To the best of our knowledge, only one previous work [13]
considered problem (5) in its matrix form. It proposed an
algorithm to solve it approximately in two steps. First, it
applies a homotopy method to generate a regularization path
for every column, that is, a set of solutions representing
different tradeoffs between error and sparsity. This first step
is only approximate because the homotopy method relies on
an ℓ1-penalized formulation, so there is no guarantee that
the solutions computed correspond to the real Pareto front
of the biobjective k-sparse NNLS problem. Second, it selects
one solution per column to build a solution matrix X that minimizes the error while respecting a matrix-wise sparsity constraint; this is done with a greedy-like algorithm that is very cheap yet was shown to solve the selection subproblem optimally.
Here, we use a similar approach, but we replace the homo-
topy method of the first step by our algorithm Arbo-Pareto. We
call this new approach Arbo+sel. After computing the Pareto
front for every column, we build a cost matrix C where every entry C(k′, j) is the error of the k′-sparse solution (with k′ between k and r) of the jth column of X. Then, we select
one solution per column to build an optimal solution matrix
X, following the greedy-like method from [13]. In a nutshell,
we consider one cursor per column of C, and begin with all
cursors at zero, meaning that the zero vector is selected for
each column. Then, at each iteration, we choose one cursor to
increment such that the error decrease is maximized. We stop
when the sparsity target q is reached. This greedy selection
is globally optimal because the squared Frobenius norm is
separable by columns.
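The selection step can be sketched as follows: one cursor per column moves along that column's Pareto front, and at each step we advance the cursor that yields the largest error decrease, until the global budget q is spent. This is our own compact rendition of the selection procedure of [13]; the cost matrix C is assumed to have r + 1 rows, with C[k′+1, j] the error of the best k′-sparse solution of column j (the first row corresponding to the zero vector).

    # Greedy cursor selection over the per-column Pareto fronts (sketch of the
    # selection step of [13]). Returns the sparsity level chosen for each column.
    function select_sparsities(C::AbstractMatrix, q::Int)
        nrows, n = size(C)
        cursors = ones(Int, n)               # start every column at the zero solution (k' = 0)
        for _ in 1:q                         # spend the global budget of q nonzero entries
            gains = [cursors[j] < nrows ? C[cursors[j], j] - C[cursors[j] + 1, j] : -Inf
                     for j in 1:n]
            j = argmax(gains)                # column whose next step decreases the error the most
            gains[j] <= 0 && break           # no column improves anymore
            cursors[j] += 1
        end
        return cursors .- 1                  # k'_j for each column, with sum at most q
    end

With the chosen sparsity levels in hand, X is assembled by taking, for each column, the stored solution of the corresponding level from its Pareto front.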
If Arbo-Pareto is run for each column with k = 1, then the
proposed algorithm Arbo+sel provides the globally optimal
solution for (5), because the selection subproblem is also
solved exactly.
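Purely as an illustration of the overall pipeline, the two sketches above (arbo_pareto from Section IV and select_sparsities from this section) can be wired together as follows; this is our own simplified rendition of Arbo+sel, not the released implementation.

    using LinearAlgebra

    # Illustrative end-to-end Arbo+sel: per-column Pareto fronts, then global selection.
    function arbo_sel(A, B, q::Int)
        r, n = size(A, 2), size(B, 2)
        fronts = [arbo_pareto(A, B[:, j], 1) for j in 1:n]     # (errors, solutions) per column
        C = vcat([norm(B[:, j])^2 for j in 1:n]',               # row for k' = 0 (zero column)
                 hcat([fronts[j][1] for j in 1:n]...))          # rows for k' = 1, ..., r
        ks = select_sparsities(C, q)
        X = zeros(r, n)
        for j in 1:n
            if ks[j] > 0
                X[:, j] = fronts[j][2][ks[j]]                   # chosen k'_j-sparse solution
            end
        end
        return X
    end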
VI. EXPERIMENTS
In this section, we study the performance of the proposed
approach Arbo+sel on the unmixing of 4 hyperspectral images.
A hyperspectral image can be represented as a matrix B
where each column corresponds to a pixel and each row to
a different wavelength. The r columns of the dictionary A
represent the spectral signature of the pure materials (also
called endmembers) present in the image [2]. Given B and A, we compute X, whose columns represent the abundance
of materials in each pixel. Most pixels contain only a few
endmembers [14], therefore it makes sense to enforce sparsity
on X. We consider the 4 widely used hyperspectral images Samson, Jasper, Urban, and Cuprite (downloaded from http://lesun.weebly.com/hyperspectral-data-set.html), and we use as dictionaries
(that is, for the matrix A) the ground truths from [15]. The
characteristics of the data are summarized in Table I. The
number m corresponds to the number of wavelengths, n to the number of pixels, and r to the number of endmembers in the ground truth; B ∈ ℝ^{m×n} and A ∈ ℝ^{m×r}.
TABLE I
SUMMARY OF THE DATASETS STUDIED

Dataset    m     n                     r
Samson     156   95 × 95 = 9025        3
Jasper     198   100 × 100 = 10000     4
Urban      162   307 × 307 = 94249     6
Cuprite    188   250 × 191 = 47750     12
Our method, described in section V, is denoted Arbo+sel. We run Arbo-Pareto with k = 1 to compute the whole Pareto front, and then apply the selection strategy to build X with a matrix-wise sparsity constraint. We compare the performance of Arbo+sel with 3 other methods:
•An active-set algorithm that solves the NNLS problem with no sparsity constraint, denoted AS. This is equivalent to exploring only the root node in Arbo.
•The original Arbo algorithm with a column-wise k-sparsity constraint, denoted Arbo k-s.
•The algorithm from [13], which solves problem (5) approximately with a homotopy method followed by a matrix-wise selection, denoted Ht+sel.
All algorithms are implemented in Julia. They are single-threaded, and executed on a computer with an Intel Core i5-8350U processor @ 1.70 GHz. Source code and scripts are provided in an online repository (https://gitlab.com/nnadisic/giant.jl).
For every dataset, we run the 4 algorithms and measure the average column sparsity of the solutions (number of entries larger than 10⁻³ divided by the number of columns, after a normalization of the columns so that the maximum per column is 1), the relative error ‖AX − B‖_F / ‖B‖_F, and the running time (median over 10 runs). Jasper and Urban are processed once with all algorithms for k = q/n = 2, and once with Ht+sel and Arbo+sel for q/n = 1.8, which is not possible with the other algorithms.
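For completeness, the two reported metrics can be computed as in the short sketch below (our own code, matching the description above: each column of X is scaled so that its maximum is 1, and entries above 10⁻³ are counted as nonzero).

    using LinearAlgebra

    # Average column sparsity: entries larger than tol after scaling each
    # (nonnegative) column of X so that its maximum is 1.
    function average_sparsity(X; tol = 1e-3)
        n = size(X, 2)
        total = sum(count(X[:, j] ./ maximum(X[:, j]) .> tol) for j in 1:n)
        return total / n
    end

    # Relative reconstruction error ||AX - B||_F / ||B||_F
    relative_error(A, X, B) = norm(A * X - B) / norm(B)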
The results of the experiments for the unmixing of hyper-
spectral images are shown in Table II. Time is in seconds and
the relative error is in percent. Without sparsity constraint,
we observe that the results are already quite sparse. The
column-wise k-sparse method Arbo produces solutions with
an average sparsity below the target k, meaning that many
columns are actually sparser than the target. Logically, all
methods enforcing sparsity increase the reconstruction error.
However, this loss is limited for Ht+sel and even smaller for
Arbo+sel. Arbo+sel is always better than the other sparsity-
enforcing methods, which is expected as it is the only one
that solves problem (5) exactly. Arbo-based methods show an
increase in computing time, but it is reasonable for the datasets
with small r.
TABLE II
RESULTS OF THE EXPERIMENTS

Dataset                      Metric       AS      Arbo k-s   Ht+sel   Arbo+sel
Samson (r = 3, k = 2)        Time         0.10    0.19       0.25     0.43
                             Rel. error   3.30    3.40       3.30     3.30
                             Sparsity     2.19    1.83       2.0      2.0
Jasper (r = 4, k = 2)        Time         0.13    0.42       0.40     0.80
                             Rel. error   5.71    6.18       5.72     5.71
                             Sparsity     2.23    1.78       2.0      2.0
Jasper (r = 4, q/n = 1.8)    Time         n/a     n/a        0.39     0.78
                             Rel. error   n/a     n/a        5.95     5.74
                             Sparsity     n/a     n/a        1.8      1.8
Urban (r = 6, k = 2)         Time         2.19    13.26      6.66     29.63
                             Rel. error   7.67    8.27       7.83     7.71
                             Sparsity     2.62    1.83       2.0      2.0
Urban (r = 6, q/n = 1.8)     Time         n/a     n/a        6.52     29.22
                             Rel. error   n/a     n/a        8.22     7.80
                             Sparsity     n/a     n/a        1.8      1.8
Cuprite (r = 12, k = 4)      Time         1.53    224.23     6.82     1408.5
                             Rel. error   1.74    1.94       2.01     1.83
                             Sparsity     6.60    3.81       4.0      4.0
[Figure 3, four panels: (a) Active-set (no sparsity constraint), (b) Arbo with k = 2, (c) Ht+sel with q/n = 1.8, (d) Arbo+sel with q/n = 1.8.]
Fig. 3. Abundance maps of the sixth endmember from the unmixing of the Urban hyperspectral image (that is, the sixth row of X, reshaped) by several algorithms.
The extra cost of Arbo+sel compared to Arbo k-s is due to the fact that we run Arbo-Pareto with k = 1 to generate
the whole Pareto front. When r grows, the computing time grows exponentially; this is the main limitation of our method.
However, in applications such as hyperspectral unmixing, r
is typically small. Also, these applications are generally not real-time, so the computing time is not critical, and the extra cost of our method is acceptable when an exact result is needed.
Abundance maps of one endmember of the Urban image are shown in Figure 3. This endmember corresponds to rooftop pixels. Visually, with no sparsity constraint, the image is quite noisy and includes pixels from other materials. The column-wise Arbo removes a little of the noise from other endmembers, but it also adds noise to the rooftop pixels (some zones are blurry and pixelated). Ht+sel removes most of the noise, but it also loses a lot of information from the rooftop pixels (some relevant zones are blacked out). Arbo+sel significantly reduces the noise while preserving most of the rooftop pixels, with a more distinct separation.
VII. CONCLUSION
We proposed Arbo-Pareto, an extension of the Arborescent algorithm that computes the Pareto front of the biobjective k-sparse NNLS problem, that is, the set of optimal k′-sparse solutions for different values of k′. We also proposed Arbo+sel, a way to leverage this extension to solve exactly multiple right-hand-sides NNLS with a matrix-wise sparsity constraint, by computing a Pareto front for every column and then applying an optimal selection strategy. We showed that, for a modest increase in computing cost, Arbo+sel brings improvements over existing methods in the unmixing of hyperspectral images. It scales well and is applicable to large datasets, as long as the rank r is small.
REFERENCES
[1] D. D. Lee and H. S. Seung, “Unsupervised learning by convex and conic
coding,” in Advances in neural information processing systems, 1997,
pp. 515–521.
[2] J. M. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader,
and J. Chanussot, “Hyperspectral unmixing overview: Geometrical,
statistical, and sparse regression-based approaches,” IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing,
vol. 5, no. 2, pp. 354–379, 2012.
[3] N. Gillis, Nonnegative Matrix Factorization. SIAM, 2020.
[4] C. L. Byrne, Applied Iterative Methods. AK Peters, 2008.
[5] P. O. Hoyer, “Non-negative sparse coding,” in IEEE Workshop on Neural
Networks for Signal Processing, 2002, pp. 557–565.
[6] J. E. Cohen and N. Gillis, “Nonnegative Low-rank Sparse Component
Analysis,” in ICASSP, 2019, pp. 8226–8230.
[7] T. T. Nguyen, J. Idier, C. Soussen, and E.-H. Djermoune, “Non-
Negative Orthogonal Greedy Algorithms,” IEEE Transactions on Signal
Processing, pp. 1–16, 2019.
[8] D. Bienstock, “Computational study of a family of mixed-integer
quadratic programming problems,” Mathematical programming, vol. 74,
no. 2, pp. 121–140, 1996.
[9] S. Bourguignon, J. Ninin, H. Carfantan, and M. Mongeau, “Exact Sparse
Approximation Problems via Mixed-Integer Programming: Formulations
and Computational Performance,” IEEE Transactions on Signal Process-
ing, vol. 64, no. 6, pp. 1405–1419, 2016.
[10] R. B. Mhenni, S. Bourguignon, and J. Ninin, “Global Optimization for
Sparse Solution of Least Squares Problems,” Preprint hal-02066368,
2019.
[11] N. Nadisic, A. Vandaele, N. Gillis, and J. E. Cohen, “Exact Sparse
Nonnegative Least Squares,” in ICASSP, 2020, pp. 5395 – 5399.
[12] L. F. Portugal, J. J. Judice, and L. N. Vicente, “A comparison of block
pivoting and interior-point algorithms for linear least squares problems
with nonnegative variables,” Mathematics of Computation, vol. 63, no.
208, pp. 625–643, 1994.
[13] N. Nadisic, A. Vandaele, and N. Gillis, “A homotopy-based algorithm
for sparse multiple right-hand sides nonnegative least squares,” Preprint
arXiv:2011.11066, 2020.
[14] W.-K. Ma, J. M. Bioucas-Dias, T.-H. Chan, N. Gillis, P. Gader, A. J.
Plaza, A. Ambikapathi, and C.-Y. Chi, “A signal processing perspective
on hyperspectral unmixing: Insights from remote sensing,” IEEE Signal
Processing Magazine, vol. 31, no. 1, pp. 67–81, 2013.
[15] F. Zhu, “Hyperspectral unmixing: ground truth labeling, datasets, bench-
mark performances and survey,” Preprint arXiv:1708.05125, 2017.