
Under-determined Sparse Blind Source Separation

of Nonnegative and Partially Overlapped Data

Yuanchang Sun∗ and Jack Xin∗

Abstract

We study the solvability of sparse blind separation of n non-negative sources from m linear mixtures in the under-determined regime m < n. The geometric properties of the mixture matrix and the sparseness structure of the source matrix are closely related to the identification of the mixing matrix. We first illustrate and establish necessary and sufficient conditions for the unique separation in the case of m mixtures and m + 1 sources, and develop a novel algorithm based on data geometry, source sparseness, and l1 minimization. Then we extend the results to any order m × n, 3 ≤ m < n, based on the degree of degeneracy of the columns of the mixing matrix. Numerical results substantiate the proposed solvability conditions, and show satisfactory performance of our approach.

Key Words: under-determined, non-negative sources, blind separation, sparseness, uniqueness, geometric method, l1 minimization, clustering.

AMS Subject Classifications: 94A12, 65H10, 65K10, 90C05.

∗Department of Mathematics, University of California at Irvine, Irvine, CA 92697, USA.


1 Introduction

The goal of this paper is to study the blind source separation (BSS) problem for non-negative data when fewer mixture signals than sources are available. Such a case is referred to as under-determined. Under-determined blind source separation (uBSS) presents an additional challenge beyond determined or over-determined BSS in that the mixing matrix is non-invertible. For simplicity, we consider the linear BSS model:

X = AS,    (1.1)

where X ∈ R^{m×p} is the mixture matrix containing the known mixed signals as its rows, S ∈ R^{n×p} is the unknown source matrix, and A ∈ R^{m×n} is the unknown mixing matrix. All matrices are non-negative. The dimensions of the matrices are expressed in terms of three numbers: (1) p is the number of available samples, (2) m is the number of mixture signals, and (3) n is the number of source signals. Both X and S are sampled functions of an acquisition variable, which may be time, frequency, position, or wavenumber depending on the measurement device. The mathematical problem is to estimate non-negative A and S from X, which is also known as non-negative matrix factorization (NMF).
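For concreteness, model (1.1) in the under-determined regime can be sketched numerically as follows (the dimensions and random data here are hypothetical illustrations, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, p = 2, 3, 100          # mixtures, sources, samples (hypothetical sizes)
A = rng.random((m, n))       # unknown non-negative mixing matrix, m < n
S = rng.random((n, p))       # unknown non-negative source matrix
X = A @ S                    # observed mixture matrix; rows are mixed signals

# Only X is observed; the BSS/NMF problem is to recover A and S from it.
assert X.shape == (m, p) and (X >= 0).all()
```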

BSS has found numerous applications in areas from engineering to neuroscience [5, 9, 10, 18, 21, 22, 23, 25], and a number of methods have been proposed based on a priori knowledge of the source signals, such as spatio-temporal decorrelation, statistical independence, and sparseness. For instance, independent component analysis (ICA) [9, 10, 11, 12, 21, 22, 23] recovers statistically independent source signals and the mixing matrix A. Statistical independence requires uncorrelated source signals, a condition that does not always hold in real-world problems; hence ICA methods in practice look for approximately independent components. Recently there have been several studies of ICA and its applications in computer tomography and biomedical image processing, where non-negativity constraints are imposed on the mixing matrix A and/or the estimated source signals S [7, 16, 27, 28, 31]. The present work is motivated by Nuclear Magnetic Resonance (NMR) spectroscopy data, which should not be assumed to satisfy statistical independence, especially when the molecules responsible for each source share common structural features [30]. Besides, the properly phased absorption-mode NMR spectral signals from a single-pulse experiment are positive. Therefore ICA-based methods would not work for this class of data. Although the NMF introduced by Lee and Seung in [20] does not assume statistical independence of the source components, NMF algorithms in general converge to different solutions on each run due to the non-convexity of the problem.

For NMR data, a better working assumption is the partial source sparseness condition proposed by Naanaa and Nuzillard (NN) in [24]. The source signals are only required to be non-overlapping at some acquisition locations (see NNA in section 2). Such a local sparseness condition leads to a dramatic mathematical simplification of a general non-convex NMF problem. Geometrically speaking, the problem of finding the mixing matrix A reduces to the identification of a minimal cone containing the columns of the mixture matrix X. Linear programming is used to identify the cone in NN's approach, while the authors of [17] proposed a geometric algorithm, the extreme vector algorithm (EVA), to find the spanning edges of the cone. The working condition for EVA, called the extreme data property, is essentially the same as NN's source condition. In fact, NN's sparseness assumption and the geometric construction of the columns of A were known in the 1990's [1, 37] in the context of blind hyperspectral unmixing, where the analogue of NN's assumption is called the pixel purity assumption. The resulting geometric (cone) method is the so-called N-findr [37], now a benchmark in hyperspectral unmixing; NN's method can be viewed as an application of N-findr [37] to NMR data. However, NN's approach and the EVA method are designed for the determined or over-determined case m ≥ n. For non-negative uBSS, new methods need to be developed. First, one may ask: is the NN source sparseness assumption good enough for non-negative uBSS?

There have been several studies on the uBSS of speech signals [7, 33, 34]. However, few results are available for the uBSS of non-negative and partially overlapped data (e.g., NMR signals). In [19], the authors extract three source spectra from two measured mixed spectra in NMR spectroscopy. Their method first recovers the mixing matrix A by clustering the mixture data in the wavelet domain, then solves for S via linear programming. The source signals in [19] are assumed to be nowhere overlapping; moreover, the method is limited to two mixtures. In this paper, we consider non-negative signals with overlap and study uBSS for an arbitrary number of mixtures.

We are particularly concerned with the conditions for unique solvability of A and S up to scaling and permutation. Motivated by NN's sparse condition, we further explore the geometric structure of the column vectors of the mixture matrix. It turns out that an additional sparseness condition on the sources (besides NN's) and a delicate one-degenerate-column condition on the mixing matrix A are needed for unique separation in uBSS. Geometrically, the NN method can only recover the spanning edges of a minimal cone containing the column vectors of X, which is not enough in general to extract all column vectors of the mixing matrix A in uBSS. Our additional conditions allow the recovery of the remaining columns of A as special interior points of the cone which lie at intersections of certain hyperplanes. Counterexamples are illustrated when these additional conditions fail. Under the additional conditions on the sparseness of S and the degree of degeneracy of A, we present a new algorithm which first retrieves A by combining NN with the geometric property of the mixtures, then recovers S by solving an l1 minimization problem.

The paper is organized as follows. In section 2, we review the essentials of NN's approach and its local sparseness assumption, then show by counterexamples that extra conditions are needed for unique recovery in uBSS. In section 3, we introduce the extra sparseness condition on the sources to accomplish unique recovery up to scaling and permutation. The geometric study of the mixture matrix suggests that this condition is in fact optimal. In section 4, we develop a new method for uBSS. We propose a novel algorithm to identify A based on the data geometry, then solve for S using l1 minimization to ensure a sparse representation. In section 5, numerical experiments are performed to verify the optimal sparseness condition and test the effectiveness of our method. Various examples, including real-world data, are prepared to show the reliability of the method. Additionally, clustering methods are discussed to recognize the hyperplanes in the geometric structure of the noisy data. In section 6, we generalize the method from mixing matrices of order m × (m + 1) to any order m × n, 3 ≤ m < n. Concluding remarks are in section 7.


This work was partially supported by NSF-ADT grant DMS-0911277. The authors

thank Professor Stanley Osher for his interest and suggestions, and Mr. Jie Feng for

helpful discussions.

2 Source Sparseness and Examples

In [24], Naanaa and Nuzillard (NN) presented an efficient sparse BSS method and its mathematical analysis for non-negative and partially overlapped signals in the (over-)determined cases of model (1.1), where m ≥ n and the mixing matrix A has full rank. In simple terms, NN's key sparseness assumption (NNA) on the source signals is that each source has a stand-alone peak at some location of the acquisition variable where the other sources are identically zero. More precisely, the source matrix S ≥ 0 is assumed to satisfy the following condition:

Assumption (NNA). For each i ∈ {1,2,...,n} there exists a j_i ∈ {1,2,...,p} such that s_{i,j_i} > 0 and s_{k,j_i} = 0 (k = 1,...,i−1, i+1,...,n).
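As a minimal sketch, the NNA can be verified directly on a known source matrix (the helper name `satisfies_nna` and the toy matrices are ours, not from the paper):

```python
import numpy as np

def satisfies_nna(S):
    """Check NN's assumption: each source i has a column j_i where it is
    the only positive entry (a stand-alone peak)."""
    S = np.asarray(S)
    for i in range(S.shape[0]):
        # columns where every source other than i vanishes
        others_zero = np.all(np.delete(S, i, axis=0) == 0, axis=0)
        if not np.any((S[i] > 0) & others_zero):
            return False
    return True

# two sources, four samples: columns 0 and 3 are the stand-alone peaks
S_good = np.array([[1.0, 0.5, 0.0, 0.0],
                   [0.0, 0.2, 0.0, 1.0]])
S_bad = np.array([[1.0, 0.5],
                  [0.3, 0.2]])   # sources overlap at every sample
print(satisfies_nna(S_good), satisfies_nna(S_bad))  # -> True False
```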

If equation (1.1) is written in terms of columns as

X_j = Σ_{k=1}^{n} s_{k,j} A_k,   j = 1,...,p,    (2.1)

the NNA implies that X_{j_i} = s_{i,j_i} A_i, or A_i = (1/s_{i,j_i}) X_{j_i}, i = 1,...,n. Hence equation (2.1) is rewritten as

X_j = Σ_{i=1}^{n} (s_{i,j}/s_{i,j_i}) X_{j_i},    (2.2)

which says that every column of X is a non-negative linear combination of the columns of Â. Here Â = [X_{j_1},...,X_{j_n}] is the submatrix of X consisting of n columns, each of which is collinear with a particular column of A. It should be noted that the j_i (i = 1,...,n) are not known and have to be computed. Once all the j_i's are found, an estimate of the mixing matrix is obtained. The identification of Â's columns is equivalent to identifying a convex cone of a finite collection of vectors [15]. The convex cone encloses the data columns of matrix X, and is the smallest such cone. Such a minimal enclosing convex cone can be found by linear programming methods. For model (1.1), the following constrained equations are formulated for the identification of Â:

Σ_{j=1, j≠k}^{p} X_j λ_j = X_k,   λ_j ≥ 0,   k = 1,...,p.    (2.3)

Then a column vector X_k will be a column of Â if and only if the constrained equation (2.3) is inconsistent (has no solution λ_j, j ≠ k). However, if noise is present, the following optimization problems are suggested to estimate the mixing matrix:

minimize score = ‖ Σ_{j=1, j≠k}^{p} X_j λ_j − X_k ‖_2,   k = 1,...,p    (2.4)

subject to λ_j ≥ 0.    (2.5)
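The scoring procedure (2.4)-(2.5), followed by the Moore-Penrose estimate of S described below, can be sketched with non-negative least squares; here `scipy.optimize.nnls` plays the role of the constrained solver, and the toy data are our own (with n = 2 for visibility):

```python
import numpy as np
from scipy.optimize import nnls

def nn_scores(X):
    """Score each column of X by its distance to the non-negative span
    of the remaining columns, as in (2.4)-(2.5)."""
    p = X.shape[1]
    scores = np.empty(p)
    for k in range(p):
        others = np.delete(X, k, axis=1)
        # min ||others @ lam - X_k||_2 subject to lam >= 0
        _, residual = nnls(others, X[:, k])
        scores[k] = residual
    return scores

# toy data: columns 0 and 1 are the cone's edges, columns 2-3 lie inside
X = np.array([[1.0, 0.0, 0.5, 0.2],
              [0.0, 1.0, 0.5, 0.8]])
scores = nn_scores(X)                    # interior columns score ~0
A_hat = X[:, np.argsort(scores)[-2:]]    # n = 2 highest-scoring columns
S_hat = np.linalg.pinv(A_hat) @ X        # estimate S (up to permutation/scaling)
```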


A score is thus associated with each column. A column with a low score is unlikely to be a column of Â, because such a column is roughly a non-negative linear combination of the other columns of X. On the other hand, a high score means that the corresponding column is far from being a non-negative linear combination of the other columns of X. In practice, the n columns of X with the highest scores are selected to form Â, the estimated mixing matrix. The Moore-Penrose inverse Â⁺ of Â is then calculated, and an estimate of S is obtained: Ŝ = Â⁺X.

The NN method is very efficient in separating NNA sources for determined and over-determined BSS problems. It is also robust in that major peaks can still be recovered when the NNA is violated to a certain extent. A recent study of the authors investigated how to post-process with an abundance of mixture data, and how to improve the mixing matrix estimation with major-peak-based corrections [32]. Here we are interested in extending the NN method to uBSS while maintaining the uniqueness of source recovery. The following two examples show that extra conditions are necessary.

Example 1: Let (m,n) = (2,3), and assume that the mixing matrix A ∈ R^{2×3} has pairwise linearly independent columns, with the remaining column being a non-negative linear combination of the other two. The source matrix satisfies the NNA. The NN method can only detect the two columns which span the cone of column vectors of X; see A_1 and A_2 in Fig. 2.1. The remaining (third) column of A is contained in the cone, but it is impossible to identify it: any interior vector of the cone could be a candidate.
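A quick numerical check of Example 1 (with toy numbers of our own): any data column generated by the interior column A_3 is also an exact non-negative combination of the edge columns A_1 and A_2, so the data geometry alone cannot pin down A_3:

```python
import numpy as np

# Example 1 setup: m = 2 mixtures, n = 3 sources (toy numbers, ours)
A1, A2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A3 = 0.4 * A1 + 0.6 * A2        # third column lies inside the cone of A1, A2
A3_alt = 0.7 * A1 + 0.3 * A2    # a different interior vector, equally plausible

# A mixture column produced purely by source 3 ...
x = 2.0 * A3
# ... is also an exact non-negative combination of the edges alone,
# so x never singles out A3 among the interior vectors of the cone.
assert np.allclose(0.8 * A1 + 1.2 * A2, x)
```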

Example 2: Let (m,n) = (3,4), and assume that none of A's four columns is a non-negative linear combination of the others. The source matrix satisfies the NNA. Then the n columns of A form a convex cone enclosing the data points (columns of X), and the NN method recovers all the columns of A by identifying the edges of the minimal bounding convex cone. Fig. 2.2 shows the four column vectors A_1,...,A_4; by proper scaling of A_4, we arrange them on a plane. Any point contained in both triangles A_1A_2A_3 and A_1A_2A_4 admits two linear representations, indicating non-uniqueness of the corresponding column vector of S.

The second example extends to m ≥ 3 as well, whenever columns of the data matrix X admit multiple representations by column vectors of A. The examples suggest that solving uBSS requires extra sparseness conditions on S in addition to the NNA, so that the matrix S has fewer degrees of freedom and a unique solution becomes feasible. In the next section, we propose a maximum overlap condition on the NNA source signals to guarantee a unique separation of A and S.

3 Maximum Overlap Condition

Consider m ≥ 2 mixtures and n > m sources. We propose to strengthen the NNA by the (m − 1)-tuplewise maximum overlap condition (MOC) on the source signals:

Assumption (MOC-NNA). Each column of the source matrix S has at most m − 1 nonzero entries. Furthermore, for each i ∈ {1,2,...,n} there exists a j_i ∈ {1,2,...,p} such that s_{i,j_i} > 0 and s_{k,j_i} = 0 (k = 1,...,i−1, i+1,...,n).
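In the same spirit as the NNA check above, MOC-NNA can be verified column-wise; this sketch (helper name and toy matrix are ours) takes m = 3 mixtures and n = 4 sources:

```python
import numpy as np

def satisfies_moc_nna(S, m):
    """Check MOC-NNA: every column of S has at most m-1 nonzeros (MOC),
    and each source has a stand-alone peak at some sample (NNA)."""
    S = np.asarray(S)
    if np.any((S > 0).sum(axis=0) > m - 1):      # MOC: column sparsity
        return False
    for i in range(S.shape[0]):                  # NNA: stand-alone peaks
        others_zero = np.all(np.delete(S, i, axis=0) == 0, axis=0)
        if not np.any((S[i] > 0) & others_zero):
            return False
    return True

# m = 3, n = 4: each column has at most m - 1 = 2 nonzeros,
# and columns 0-3 provide the stand-alone peaks.
S = np.array([[1.0, 0.0, 0.0, 0.0, 0.5],
              [0.0, 1.0, 0.0, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 0.0]])
print(satisfies_moc_nna(S, m=3))   # -> True
```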

The MOC puts a maximum overlap of m − 1 entries, i.e., a minimum sparseness condition, on the columns of S. Simply put, this condition requires not only that each source has