Fast algorithm for complex-NMF with application to source separation
Andersen Ang^{1,3}, Valentin Leplat^2, Nicolas Gillis^3
1 Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Canada
2 ICTEAM, Université catholique de Louvain, Louvain-la-Neuve, Belgium
3 Mathématique et recherche opérationnelle, Université de Mons, Mons, Belgium
ms3ang@uwaterloo.ca, valentin.leplat@uclouvain.be, nicolas.gillis@umons.ac.be
Abstract—In this paper we consider a Nonnegative Matrix Factorization (NMF) model on complex numbers; in particular, we propose a group complex-NMF (cNMF) model that subsumes the phase-consistency complex NMF used for audio Blind Source Separation (aBSS). Using Wirtinger calculus, we propose a gradient-based algorithm to solve cNMF. The algorithm is then further accelerated using a heuristic extrapolation scheme. Numerical results show that the accelerated algorithm has significantly faster convergence.
Index Terms—Nonnegative Matrix Factorization, Blind source
separation, Phase, Wirtinger calculus, Algorithm, Extrapolation
I. INTRODUCTION
Nonnegative Matrix Factorization (NMF) [1], [2] is the problem of finding two nonnegative matrices, W and H, from a matrix X ∈ R^{m×n}_+ such that X ≈ WH. The factors W, H are usually computed by tackling a non-convex optimization problem, which is notoriously hard to solve. By extending from R^{m×n} to C^{m×n}, complex NMF (cNMF) introduces more variables, which makes the problem even more difficult to solve. Given X ∈ C^{m×n} and r, it is defined as follows:

  min_{W ∈ 𝒲, H ∈ ℋ, Θ ∈ Ω}  D(X|F) + R(F),   F := ℓ(W, H, e^{iΘ}),   (cNMF)

where the sets 𝒲 = R^{m×r}_+, ℋ = R^{r×n}_+ and Ω = [−π, π]^{m×n×G} are all convex, and the function D measures the distance between X and F. For simplicity, in this paper we focus on the Euclidean norm, using D(X|F) = (1/2)‖X − F‖²_F. We refer the reader to [3] for the general case of β-divergences. The term R is a regularization term known as the consistency penalty, which will be discussed in Section II-B.
The estimator F is computed via the function ℓ : 𝒲 × ℋ × Ω → C^{m×n} that produces a matrix to fit X. Three examples of ℓ are:

  rank-1 1-phase:         F = WH ⊙ e^{iΘ},                              (1a)
  rank-1 multi-phase:     F = Σ_{j=1}^{r} w_j h_j ⊙ e^{iΘ_j},           (1b)
  grouped multi-phase:    F = Σ_{j=1}^{G} W_{r_j} H_{r_j} ⊙ e^{iΘ_j},   (1c)
AA thanks C. Févotte for introducing the works of Le Roux. This project
has received funding from the European Research Council (ERC) under the
European Union’s Horizon 2020 research and innovation program (grant
agreement No. 788368). NG acknowledges the support by ERC starting
grant No 679515, the Fonds de la Recherche Scientifique and the Fonds
Wetenschappelijk Onderzoek under EOS project O005318F-RG47.
where W, H are real and e^{iΘ} is complex, ⊙ is the Hadamard product, i = √−1, and e^{i(·)} is the component-wise complex exponential. For fitting the phase angle of X, model (1a) introduces the angle Θ. Let w_j and h_j denote the jth column of W and the jth row of H, respectively. We have

  WH ⊙ e^{iΘ} = Σ_{j=1}^{r} w_j h_j ⊙ e^{iΘ},

so (1b) considers the more general case where each rank-1 factor w_j h_j has its own phase e^{iΘ_j}; this model was introduced in earlier papers on complex NMF [4], [5].
In this paper we introduce (1c), which we refer to as the grouped multi-phase model; it is a natural generalization of (1b), see Section II.
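To make the three estimators concrete, the following Python/NumPy sketch (our own illustration; the variable names, dimensions and grouping are arbitrary and not taken from the paper's code) constructs F for (1a), (1b) and (1c) from given W, H and phase variables.

# Illustrative sketch: building F for models (1a), (1b), (1c)
import numpy as np

m, n, r = 64, 100, 6
rng = np.random.default_rng(0)
W = rng.random((m, r))                       # nonnegative frequency factors
H = rng.random((r, n))                       # nonnegative time activations

# (1a) rank-1 1-phase: a single phase matrix shared by all components
Theta = rng.uniform(-np.pi, np.pi, (m, n))
F_1a = (W @ H) * np.exp(1j * Theta)

# (1b) rank-1 multi-phase: one phase matrix per rank-1 term
Theta_r = rng.uniform(-np.pi, np.pi, (m, n, r))
F_1b = sum(np.outer(W[:, j], H[j, :]) * np.exp(1j * Theta_r[:, :, j]) for j in range(r))

# (1c) grouped multi-phase: one phase matrix per group of columns/rows of W, H
groups = [[0, 1], [2, 3], [4, 5]]            # G = 3 groups, r_1 = r_2 = r_3 = 2
Theta_G = rng.uniform(-np.pi, np.pi, (m, n, len(groups)))
F_1c = sum((W[:, g] @ H[g, :]) * np.exp(1j * Theta_G[:, :, j]) for j, g in enumerate(groups))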
Application  cNMF finds applications in problems whose data belong to the complex domain; a typical example is audio Blind Source Separation (aBSS) [6]–[8]. It is also related to phase retrieval [9] and finds applications in optics. In this paper we use aBSS as an illustration, which we discuss further in Section II.
Contribution and organization  This paper has two contributions. First, we introduce cNMF with (1c), which generalizes the cNMF (1b) studied in [4], [5]; see also [10]. The model (1c) comes naturally from (1b), but, to the best of our knowledge, it has never been addressed. Then, the main contribution of this paper is the design of a single algorithmic framework for solving cNMF that covers all the variants in (1). Although W, H in cNMF are real, in general we can treat W, H, Θ as complex variables and use a systematic approach based on Wirtinger calculus [11], [12] to design gradient-type algorithms to solve cNMF; see Section III. Furthermore, we adapt the heuristic extrapolation strategy from [13]–[15] to the proposed Wirtinger gradient scheme, which significantly accelerates its convergence; see Section IV for numerical experiments.
Scope of the paper  This paper focuses on algorithmic aspects. We do not benchmark the proposed algorithms against other frameworks on aBSS performance, as that would require dedicated data preprocessing, post-processing and parameter tuning.
II. CNMF MODELS AND ABSS
This section provides the literature review and background for the paper. We first present how real NMF, as in [7], [8], is used to solve single-channel aBSS, as shown in Fig. 1; then we move to the cNMF previously studied in [4], [5]. Finally, we describe how to generalize these models to the grouped multi-phase model.
Fig. 1. The processing pipeline of [7], [8] for using NMF to solve aBSS: the time-domain mixture x is mapped by the STFT Ψ_ω to the complex spectrogram X; the amplitude spectrogram V = |X| is factorized as WH; each rank-1 component w_j h_j is recombined with the mixture phase e^{iΘ} to form Y_j, which is mapped back to the time domain by the iSTFT Ψ†_{ω̃} to give the source estimate y_j.
A. Real NMF on single-channel aBSS
Let [r] := {1, 2, ..., r}. Assume r audio sources {s_j ∈ R^L}_{j∈[r]} are linearly mixed as x = Σ_j s_j; aBSS aims to recover the s_j from the observation of their mixture x ∈ R^L. We follow the data processing pipeline (Fig. 1) to estimate these sources. First we compute X = Ψ_ω(x), where Ψ_ω denotes the Short-Time Fourier Transform (STFT) with an analysis window ω, and X is the spectrogram containing the distribution of "energy" of the signal across time-frequency coordinates. Such information can be extracted by NMF methods through the computation of W, H that respectively capture the frequency and temporal patterns [8], [16]. NMF is used here since both the frequency spectra and the time activations are nonnegative.
For aBSS with NMF methods, the input matrix V typically corresponds to the amplitude spectrogram derived from X as V(k, m) = |X(k, m)| for all k, m. Then we find W, H by solving the (real) NMF problem:

  [W, H] = argmin_{W ∈ 𝒲, H ∈ ℋ}  (1/2)‖V − WH‖²_F.   (2)

From WH, each rank-1 matrix w_j h_j ∈ R^{K×M}_+ is converted to a component spectrogram Y_j ∈ C^{K×M} by multiplying w_j h_j element-wise with the phase of the mixture spectrogram X as follows: Y_j = w_j h_j ⊙ e^{iΘ}, where the phase spectrogram e^{iΘ} ∈ C^{K×M} is defined by Θ(k, m) = ∠X(k, m), with ∠ the argument function that returns the angle of the input complex number. Finally, Ψ†_{ω̃}(Y_j) is computed to generate the vector y_j ∈ R^L, where Ψ†_{ω̃} is the pseudo-inverse of Ψ_ω associated with the synthesis window ω̃.
Usual assumptions when using NMF for aBSS  Under some conditions we expect the vectors y_j to accurately estimate the sources s_j. Two of these assumptions are:
  Consistency: each Y_j is consistent w.r.t. Ψ_ω.
  Rank-1 source: each source is well-approximated by a rank-1 matrix.
Relaxing these assumptions leads to models (1b) and (1c).
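For illustration, the sketch below condenses the pipeline of Fig. 1 into a few lines of Python (our own sketch, not the paper's code): SciPy's stft/istft stand in for Ψ_ω and Ψ†_{ω̃}, plain multiplicative updates stand in for a solver of (2), and the window length, rank and iteration count are arbitrary illustrative choices.

# Illustrative sketch of the real-NMF aBSS pipeline of Fig. 1
import numpy as np
from scipy.signal import stft, istft

def separate_real_nmf(x, fs, r=4, n_iter=200, nperseg=512):
    _, _, X = stft(x, fs=fs, nperseg=nperseg)    # complex spectrogram X = Psi_w(x)
    V = np.abs(X)                                # amplitude spectrogram V(k,m) = |X(k,m)|
    K, M = V.shape

    # solve (2) approximately with multiplicative updates (one possible NMF solver)
    rng = np.random.default_rng(0)
    W = rng.random((K, r)) + 1e-9
    H = rng.random((r, M)) + 1e-9
    for _ in range(n_iter):
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)

    # reuse the mixture phase: Y_j = w_j h_j ⊙ e^{iΘ}, then invert with the iSTFT
    phase = np.exp(1j * np.angle(X))
    estimates = []
    for j in range(r):
        Yj = np.outer(W[:, j], H[j, :]) * phase
        _, yj = istft(Yj, fs=fs, nperseg=nperseg)
        estimates.append(yj)
    return estimates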
B. Complex NMF and consistency
By directly factorizing X instead of V, we arrive at cNMF (1a). We now explain the regularization term. As the STFT is not surjective, a matrix P ∈ C^{K×M} does not necessarily correspond to the STFT of a vector; in that case, we say that P is not consistent w.r.t. Ψ_ω [5], [10], [17]–[19]. Since the phase is not considered in the computation of W, H in (2), it is possible for the aforementioned pipeline to produce inconsistent Y_j, although X is always consistent. To generate consistent solutions, we can consider the phase-consistent [5], [10], [17]–[19] cNMF:
  min_{W ∈ 𝒲, H ∈ ℋ, Θ ∈ Ω}  D(X|F) + R(F),   R(F) = (λ/2)‖E(F)‖²_F,   (3)

where λ is a penalty weight (a nonnegative scalar), and the term E is the phase-inconsistency error defined as

  E(F) := F − Ψ_ω Ψ†_{ω̃}(F) = B(F),   B := I − Ψ_ω Ψ†_{ω̃};   (4)

we call a matrix P consistent if E(P) = 0.
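The inconsistency error (4) can be evaluated with one iSTFT/STFT round trip. The sketch below is our own illustration, with scipy.signal playing the role of Ψ_ω and its pseudo-inverse; a random complex matrix yields a large ‖E(P)‖_F, while an actual spectrogram is consistent by construction.

# Illustrative sketch: measuring the inconsistency ||E(P)||_F of (4)
import numpy as np
from scipy.signal import stft, istft

def inconsistency(P, nperseg=512):
    _, p = istft(P, nperseg=nperseg)             # Psi^+ applied to P
    _, _, P_back = stft(p, nperseg=nperseg)      # Psi applied back
    M = min(P.shape[1], P_back.shape[1])         # crop: frame counts may differ by one
    return np.linalg.norm(P[:, :M] - P_back[:, :M])

rng = np.random.default_rng(0)
P = rng.standard_normal((257, 80)) + 1j * rng.standard_normal((257, 80))
print(inconsistency(P))                          # large: a random matrix is not consistent

_, _, X = stft(rng.standard_normal(20000), nperseg=512)
print(inconsistency(X))                          # ~0: a true spectrogram is consistent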
C. Multi-phase and group factorization
We now discuss how (3) can be generalized by cNMF (1b),
which is then naturally extended to cNMF (1c).
Multi-phase  Note that there is only one Θ in (3), meaning all r rank-1 component amplitude spectrograms w_j h_j share the same phase. The multi-phase model considers the more general case where different sources can have different phases. In this case, Θ ∈ R^{m×n×r} is a 3rd-order tensor, whose jth frontal slice, denoted Θ_j, corresponds to the phase angle of the jth source estimate. This leads to the model cNMF (1b).
Group factorization  NMF [7], [8] and cNMF (3) assume that each source is well-approximated by a rank-1 matrix in the factorization, and that the factorization rank r corresponds to the number of sources. Such an assumption only holds for relatively simple data; in practice, complex real-world audio sources cannot be well-approximated by a rank-1 term. Here we consider using a group of rank-1 components to approximate the amplitude part of a source; in this way we arrive at cNMF (1c), where W_{r_j}, H_{r_j} are nonnegative matrices with r_j columns and r_j rows, respectively. In this case, we are considering a rank-r NMF with r = Σ_{i=1}^{G} r_i, where G is the total number of groups (the number of sources). Immediately, we see a drawback of such a model: it has more parameters, in particular the parameters r_1, ..., r_G, which correspond to the model orders. As model order selection is a research topic on its own, in this paper we do not deal with strategies to tune these parameters; instead, we fix their values for the numerical experiments based on a simple prior inspection of the data. For example, a simple heuristic is to use the SVD to numerically check the rank of each portion of the data.
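For instance, a rough estimate of each r_j can be obtained by counting how many singular values are needed to capture most of the energy of the portion of the amplitude spectrogram where a source dominates; a minimal sketch of this heuristic follows (the 90% energy threshold is an arbitrary illustrative choice, not a value from the paper).

# Illustrative sketch: SVD-based guess of the model order of a portion of the data
import numpy as np

def numerical_rank(V_portion, energy=0.90):
    s = np.linalg.svd(V_portion, compute_uv=False)
    cum = np.cumsum(s**2) / np.sum(s**2)          # fraction of energy captured
    return int(np.searchsorted(cum, energy) + 1)

# e.g. r_j = numerical_rank(V[:, frames_where_source_j_is_active])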
Now, we see that real NMF (2) ⊂ cNMF (1a) = (3) ⊂ cNMF (1b) ⊂ cNMF (1c). Thus it is meaningful to study how to effectively solve cNMF, which is the main subject of the next section. Note that fitting a complex matrix is in general harder than fitting a real matrix; this can be easily understood since equality between two numbers is a stronger condition on C than on R, as it requires equality in both modulus and phase. Furthermore, we recall that in [4], [5], sophisticated algorithms based on the MM framework with multiplicative-type updates are derived to solve cNMF (1a); these are not suitable here since they cannot directly be used to solve cNMF (1c). Moreover, these algorithms can potentially have slow convergence rates. In fact, multiplicative-type updates are well known to converge more slowly than gradient-based algorithms for NMF when using the Frobenius norm [20]; see also [2], [21] and the references therein. In the next section, we derive a fast gradient-based algorithm to solve cNMF.
III. ALGORITHMS FOR CNMF MODELS
Here we first review the background on Wirtinger calculus (W-calculus) [11], following a similar approach to that presented in [22, Section 6], and then we present our method to solve cNMF. For details on W-calculus, see [12].
A. W-derivatives of real-valued functions of complex variables
In general, functions defined on C^n are not holomorphic, i.e., they are not complex-differentiable on their domain. A first naive approach would be to consider a related function defined on R^{2n} so that the calculus rules on R can be used. However, this approach can be rather tedious. W-calculus provides an alternative, equivalent formulation with compact and elegant notation. The core ideas are the W-differential operator and the use of the conjugate coordinates [z, z*]^⊤ when computing the full gradient. Denote ∂_z f = ∂f/∂z for a differentiable f and let z* = z̄. The W-derivatives w.r.t. z = x + iy ∈ C, x, y ∈ R, and its conjugate z* := x − iy are, respectively,

  ∂_z f := (1/2)(∂_x f − i ∂_y f)   and   ∂_{z*} f := (1/2)(∂_x f + i ∂_y f).

If f is real-valued,

  (∂_z f)* = ∂_{z*} f.   (5)
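As a quick numerical sanity check of these definitions (our own illustration, not part of the paper), the W-derivatives of the real-valued function f(z) = |z|² can be computed by finite differences on the real and imaginary parts; they match the closed forms ∂_z f = z̄ and ∂_{z*} f = z and satisfy relation (5).

# Illustrative sketch: Wirtinger derivatives of f(z) = |z|^2 by finite differences
import numpy as np

def wirtinger_derivatives(f, z, h=1e-6):
    dfx = (f(z + h) - f(z - h)) / (2 * h)            # df/dx
    dfy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # df/dy
    return 0.5 * (dfx - 1j * dfy), 0.5 * (dfx + 1j * dfy)   # (d_z f, d_{z*} f)

z0 = 1.3 - 0.7j
dz, dzc = wirtinger_derivatives(lambda z: abs(z) ** 2, z0)
print(dz, np.conj(z0))   # d_z f = conj(z), and dz = conj(dzc) as in (5)
print(dzc, z0)           # d_{z*} f = z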
For a real-valued multi-variable function f : C^n ∋ z = [z_1, z_2, ..., z_n]^⊤ ↦ w = f(z) ∈ R, the partial gradients w.r.t. z and z*, and the full gradient, are

  ∂_z f := [∂_{z_1} f, ..., ∂_{z_n} f]^⊤,   ∂_{z*} f := [∂_{z*_1} f, ..., ∂_{z*_n} f]^⊤,   and   ∇f := [∂_z f; ∂_{z*} f],   (6)

where the full gradient ∇f consists of the partial gradients ∂_z f and ∂_{z*} f. As we work with real-valued f, applying (5) in (6) means that we only need to compute one partial gradient, as the other one is its conjugate. We are now ready to discuss how to solve cNMF.
B. The gradient-update steps
We now discuss how to solve cNMF. For simplicity, here we focus on cNMF with F = WH ⊙ e^{iΘ}, that is, (1a); the same approach applies to the other cNMF models. In general, we solve (3) by (inexact) Block Coordinate Descent (BCD): we alternate between solving the sub-problem in one block of variables while fixing the others at their most recent values. We now discuss how to solve each subproblem.
Subproblem in W  This subproblem has the following form:

  min_{W ≥ 0}  f := (1/2)‖(WH) ⊙ D − X‖²_F + (λ/2)‖B((WH) ⊙ D)‖²_F,   (7)

where D = e^{iΘ}. We solve (7) by iterating the projected gradient step (proximal gradient step)

  W^{k+1} = argmin_W  Re⟨∂_W f(W^k, H^k, Θ^k), W⟩ + (L^k_W / 2)‖W − W^k‖² + i_+(Re{W})
          = [Re{W^k − (1/L^k_W) ∂_W f(W^k)}]_+,   (8)

where Re(·) takes the real part, L^k_W is the Lipschitz constant of the gradient at iteration k, i_+ is the indicator function of the nonnegative orthant that encodes the nonnegativity constraint, and [·]_+ = max{·, ε} with ε a small positive value. Using the W-derivative, the gradient w.r.t. the Hermitian W^H is

  ∂_{W^H} f = H(D^H ⊙ (QF − X)^⊤) =: g(W^⊤),   (9)

where Q := I + λ B^H B and F is a linear function of W:

  F = (WH) ⊙ D.   (10)
As W is real, by (5), the gradient of f w.r.t. W is the transpose of g(W^⊤). For the Lipschitz constant L_W of the gradient (9), let ∆W = W_1 − W_2 and ∆g = g(W_1^⊤) − g(W_2^⊤); we have

  ‖∆g‖_2  =(9)  ‖H(D^H ⊙ [Q(F(W_1) − F(W_2))]^⊤)‖_2
          =(10) ‖H(D^H ⊙ [Q((∆W H) ⊙ D)]^⊤)‖_2
          ≤ ‖H‖²_2 ‖D^H ⊙ D^⊤‖_2 ‖Q‖_2 ‖∆W‖_2 =: L_W ‖∆W‖_2.   (11)
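A minimal sketch of the W-update (8) is given below (our own illustration, not the paper's code). The operator B is passed as a function B_op (e.g. the STFT/iSTFT residual of (4)); we assume for simplicity that Ψ_ω Ψ†_{ω̃} is an orthogonal projection, so that B^H B F = B_op(F) and ‖Q‖_2 ≤ 1 + λ, which gives the crude step size used below in place of the exact bound (11).

# Illustrative sketch of the projected Wirtinger-gradient step (8) on W
import numpy as np

def grad_step_W(W, H, D, X, B_op, lam, eps=1e-12):
    F = (W @ H) * D                                   # (10): current estimate
    G = (F - X) + lam * B_op(F)                       # Q F - X (B assumed to be a projection)
    grad_W = np.real((G * np.conj(D)) @ H.T)          # real gradient, i.e. the transpose of g(W^T) in (9)
    L_W = (1.0 + lam) * np.linalg.norm(H, 2) ** 2     # crude Lipschitz estimate (assumption)
    return np.maximum(W - grad_W / L_W, eps)          # [.]_+ with a small positive floor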
Subproblem in H  By taking the transpose, the subproblem in H is symmetric to (7); we use the same approach to solve it.
Subproblem in Θ  Based on (7), the subproblem in Θ is

  min_{D ∈ 𝒟}  f(D) := (1/2)‖A ⊙ D − X‖²_F + (λ/2)‖B(A ⊙ D)‖²_F,   (12)

where A = WH and D = e^{iΘ}. Note that we use the change of variable D to replace Θ as the variable; since the complex exponential has unit modulus, this requires the additional constraint |[D]_{ij}| = 1, so that 𝒟 in (12) is the set of complex matrices whose elements have a magnitude equal to one. Note that the set 𝒟 is nonconvex, which affects the convergence analysis; see the discussion in Section III-C.
To solve (12) and update D, we follow the same approach as for (8), and use

  D^{k+1} = P_𝒟{D^k − (1/L^k_D) ∂_D f(D^k)},   (13)

where P_𝒟 is the projection onto the set 𝒟. Taking D as W in (7) with H set to I, the same update (9) can be used with L_D = ‖A ⊙ A‖_2 ‖Q‖_2. Lastly, we obtain Θ^k from D^k by Θ = ∠D.
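The D-update (13) follows the same pattern; the projection P_𝒟 simply renormalizes every entry of D to unit modulus. The sketch below is our own illustration, with the same simplifying assumptions on B_op and on the step size as in the W-update sketch above.

# Illustrative sketch of the D-update (13) with projection onto unit-modulus matrices
import numpy as np

def proj_unit_modulus(D, tol=1e-12):
    return D / np.maximum(np.abs(D), tol)             # P_D: |[D]_ij| = 1

def grad_step_D(W, H, D, X, B_op, lam):
    A = W @ H
    F = A * D
    G = (F - X) + lam * B_op(F)                       # Q F - X, as before
    grad_D = A * G                                    # Wirtinger gradient w.r.t. D (A is real)
    L_D = (1.0 + lam) * np.linalg.norm(A * A, 2)      # step size in the spirit of L_D (assumption)
    D_new = proj_unit_modulus(D - grad_D / L_D)
    return D_new, np.angle(D_new)                     # Theta = angle of D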
C. The overall algorithm and convergence
Algorithm 1 shows the simplified pseudo-code of the overall algorithm. At first glance, it looks tempting to apply well-established convergence analyses for, say, the proximal gradient method and block coordinate descent to derive the theoretical convergence of the algorithm. However, we emphasize that the algorithm is a heuristic and we currently do not have a rigorous proof of convergence. It is important to note that cNMF involves complex variables, while the aforementioned convergence analyses are all built upon the assumption that the objective function is defined over R; hence it is unclear how to transfer the convergence analysis from the real case to the complex case. Furthermore, even if it turns out that the "real analysis" can be applied to such a complex problem, the convergence analysis is still nontrivial, as the update step of D involves a projection onto a nonconvex set, leading to theoretical complications from the projection P_𝒟 in the update (13).
Although we do not have a theoretical convergence guarantee, we have observed empirically that the algorithm decreases the objective function; see Section IV and in particular Figure 3.
Algorithm 1: Alternating Projected Wirtinger gradient
  Result: W, H, Θ that solve cNMF (1a)
  Initialize W^0, H^0, Θ^0, D^0 = e^{iΘ^0}, Q = I + λB^H B;
  for k = 1, 2, ... do
    Get W^{k+1} using (8) and (11);
    Get H^{k+1}, D^{k+1} similarly to W;
    Get Θ^{k+1} from D^{k+1};
  end
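Putting the pieces together, the loop of Algorithm 1 can be sketched as follows (our own illustration, reusing the hypothetical helpers grad_step_W and grad_step_D from the sketches above; the H-step is obtained by transposing the W-step).

# Illustrative sketch of Algorithm 1 built from the block updates above
import numpy as np

def algorithm1(X, W, H, Theta, B_op, lam, n_iter=1000):
    D = np.exp(1j * Theta)
    for _ in range(n_iter):
        W = grad_step_W(W, H, D, X, B_op, lam)                             # W-step, (8) and (11)
        Ht = grad_step_W(H.T, W.T, D.T, X.T, lambda F: B_op(F.T).T, lam)   # H-step by transposition
        H = Ht.T
        D, Theta = grad_step_D(W, H, D, X, B_op, lam)                      # D-step, (13); Theta = angle(D)
    return W, H, Theta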
D. Acceleration by heuristic extrapolation
Algorithm 1 can be accelerated using extrapolation, and we adopt the framework of Heuristic Extrapolation and Restart (HER) [13]–[15], which has been shown to be very effective at accelerating the convergence of NMF-type algorithms with the Frobenius norm. In a nutshell, HER is a numerical extrapolation scheme on the sequence {W^k, H^k}_{k=1,2,...} that uses an auxiliary sequence {Ŵ^k, Ĥ^k}_{k=1,2,...} as follows:

  W^{k+1} = Update(Ŵ^k, Ĥ^k),
  Ŵ^{k+1} = [W^{k+1} + β_k(W^{k+1} − W^k)]_+,
  H^{k+1} = Update(Ŵ^{k+1}, Ĥ^k),
  Ĥ^{k+1} = [H^{k+1} + β_k(H^{k+1} − H^k)]_+,
where Update can be the gradient update as in (8), and β_k is the extrapolation parameter, automatically tuned by the HER scheme. The acceleration effect of HER comes from the combination of extrapolation, cheap restarts (a safeguard mechanism), and a numerical scheme for updating the extrapolation weight β_k; see [14] for more details. Lastly, we emphasize that there is currently no theoretical convergence guarantee for the HER framework, but HER works well empirically in practice; see Section IV.
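A condensed sketch of the HER mechanism around the W- and H-blocks is shown below (our own simplification of the scheme in [13], [14]; the growth/decay factors for β_k and the restart rule are illustrative, not the tuned values of those papers).

# Illustrative sketch of HER: extrapolate W, H; restart and shrink beta if the error grows
import numpy as np

def her_loop(W, H, update_W, update_H, objective, n_iter=500,
             beta=0.5, gamma=1.05, eta=1.5, beta_max=0.999):
    W_hat, H_hat = W.copy(), H.copy()                    # auxiliary (extrapolated) sequence
    err_best = objective(W, H)
    for _ in range(n_iter):
        W_old, H_old = W, H
        W = update_W(W_hat, H_hat)                       # e.g. the step (8)
        W_hat = np.maximum(W + beta * (W - W_old), 0.0)  # extrapolate, keep nonnegative
        H = update_H(W_hat, H_hat)
        H_hat = np.maximum(H + beta * (H - H_old), 0.0)
        err = objective(W, H)
        if err <= err_best:                              # progress: grow the extrapolation weight
            beta = min(beta_max, gamma * beta)
            err_best = err
        else:                                            # no progress: restart the auxiliary sequence
            W_hat, H_hat = W.copy(), H.copy()
            beta /= eta
    return W, H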
Remark on HER  In principle we could also extrapolate the variable D in the same way as W and H. However, we do not perform extrapolation on D: we empirically observed that not extrapolating D performs better, although extrapolating D is still much faster than the unextrapolated gradient-descent algorithm.
Note that the theoretical understanding of the HER mechanism is currently still very limited. Let us give a partial explanation of the ineffectiveness of extrapolating D in the HER framework for cNMF. Since D is projected element-wise such that |[D]_{ij}| = 1, that is, all the elements of D sit on the unit circle in the complex plane, the effect of extrapolating D is ultimately just a rotation of the elements of D. We hypothesize that the HER framework is not a good setup for this kind of rotational extrapolation, since it was originally proposed to speed up NMF computation in the Frobenius norm, with Euclidean geometry in rectangular coordinates. Understanding how to extrapolate D effectively is an interesting topic for future research.
IV. EXPERIMENTS
We present numerical results to confirm the effectiveness of the proposed algorithm, and to showcase the effectiveness of the acceleration. The code is available at angms.science. Before we go into the details of the experiments, we remark that:
• As mentioned at the beginning of the paper, the task of aBSS itself involves a highly involved data processing pipeline (parameter tuning, data pre-processing, filtering and post-processing), which is not the focus of this paper; here we only demonstrate that the proposed algorithm converges quickly in practice. We set λ = D(X|F^0) / R(F^0) in all the experiments, where F^0 = ℓ(W^0, H^0, e^{iΘ^0}) and W^0, H^0, Θ^0 are the initializations of the variables W, H, Θ. This makes the two terms in the objective function well balanced.
• We do not benchmark against the algorithms proposed in [4], [5] because their codes are unavailable; see also the discussion in the last paragraph of Section II-C.
We run Algorithm 1 (with and without HER, with the same random initialization) on two data sets (see Fig. 2), which are both real-world recordings with some ambient noise: "Mary", X ∈ C^{256×714}, a piano music phrase "Mary had a little lamb"; and "Voice", X ∈ C^{256×162}, a speech of the letters "NMF". We run cNMF (1a) on Mary with r ∈ {3, 6} and on Voice with r = 75. We run group cNMF (1c) on Voice with G = 3 and two sets of r_j: r_1 = r_2 = r_3 = 15 and r_1 = r_2 = r_3 = 25 (recall that r = Σ_{j=1}^{G} r_j). All the experiments are repeated 20 times with different random initializations of (W^0, H^0, Θ^0).
Fig. 2. The spectrograms (amplitude |X| and phase ∠X) of the two datasets Mary and Voice.
Fig. 3 and Fig. 4 show the convergence results. First, we see that the HER framework significantly accelerates the convergence. Second, comparing the results of cNMF (1a) on Voice with r = 75 to group cNMF (1c) on Voice with G = 3, r_1 = r_2 = r_3 = 25 (so r = Σ_{j=1}^{G} r_j = 75), the group model has a lower objective function value. This is expected, as the group model uses more phase variables to fit the data.
Fig. 3. Convergence of the objective function values D + R on the cNMF tests (panels: Mary, r = 3; Mary, r = 6; Voice, r = 75; curves: Algorithm 1 with and without HER). The x-axes are iterations. We do not show the time plots as the cost per iteration of both algorithms is almost identical (HER adds a negligible cost [13]).
Fig. 4. Convergence curves on the group cNMF tests (r_1 = r_2 = r_3 = 15 and r_1 = r_2 = r_3 = 25). The curves are the mean over 20 experiments.
V. CONCLUSION
In this paper, we introduced the group cNMF model, which subsumes the existing cNMF and real NMF models. Using Wirtinger calculus, we derived a general gradient-based algorithm to solve cNMF. Using heuristic extrapolation with restart, we showed, on a few preliminary numerical tests, that the accelerated algorithm converges much faster.
Future work includes studying the convergence of the algorithm, performing tests with respect to the aBSS task itself, replacing the Frobenius norm in the data-fitting term by β-divergences, which are more appropriate for audio data sets, and studying identifiability issues as in [8].
REFERENCES
[1] S. A. Vavasis, “On the complexity of nonnegative matrix factorization,”
SIAM Journal on Optimization, vol. 20, no. 3, pp. 1364–1377, 2010.
[2] N. Gillis, Nonnegative Matrix Factorization. SIAM, Philadelphia, 2020.
[3] P. Magron and T. Virtanen, “Towards complex nonnegative matrix factor-
ization with the beta-divergence,” in 2018 16th International Workshop
on Acoustic Signal Enhancement (IWAENC). IEEE, 2018, pp. 156–160.
[4] H. Kameoka, N. Ono, K. Kashino, and S. Sagayama, “Complex nmf: A
new sparse representation for acoustic signals,” in 2009 IEEE Interna-
tional Conference on Acoustics, Speech and Signal Processing. IEEE,
2009, pp. 3437–3440.
[5] J. Le Roux, H. Kameoka, E. Vincent, N. Ono, K. Kashino, and
S. Sagayama, “Complex nmf under spectrogram consistency con-
straints,” in Acoustical Society of Japan Autumn Meeting, 2009.
[6] P. Magron, R. Badeau, and B. David, “Phase recovery in nmf for audio
source separation: an insightful benchmark,” in 2015 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP).
IEEE, 2015, pp. 81–85.
[7] V. Leplat, N. Gillis, X. Siebert, and A. M. S. Ang, “Séparation aveugle de sources sonores par factorization en matrices positives avec pénalité sur le volume du dictionnaire,” in XXVIIème Colloque francophone de traitement du signal et des images (GRETSI 2019), 2019.
[8] V. Leplat, N. Gillis, and A. M. S. Ang, “Blind audio source separation with minimum-volume beta-divergence nmf,” IEEE Transactions on Signal Processing, 2020.
[9] D. Lahat and C. Févotte, “Positive semidefinite matrix factorization: A link to phase retrieval and a block gradient algorithm,” in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 5705–5709.
[10] P. Magron, R. Badeau, and B. David, “Complex nmf under phase
constraints based on signal modeling: application to audio source sep-
aration,” in 2016 IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP). IEEE, 2016, pp. 46–50.
[11] W. Wirtinger, “Zur formalen theorie der funktionen von mehr komplexen veränderlichen,” Mathematische Annalen, vol. 97, no. 1, pp. 357–375, 1927.
[12] P. Bouboulis, “Wirtinger’s calculus in general hilbert spaces,” arXiv preprint arXiv:1005.5170, 2010.
[13] A. M. S. Ang and N. Gillis, “Accelerating nonnegative matrix factoriza-
tion algorithms using extrapolation,” Neural computation, vol. 31, no. 2,
pp. 417–439, 2019.
[14] A. M. S. Ang, J. E. Cohen, N. Gillis, and L. T. K. Hien, “Extrapolated al-
ternating algorithms for approximate canonical polyadic decomposition,”
in ICASSP 2020-2020 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 3147–3151.
[15] ——, “Accelerating block coordinate descent for nonnegative tensor factorization,” Numerical Linear Algebra with Applications, 2021.
[16] P. Smaragdis and J. C. Brown, “Non-negative matrix factorization for
polyphonic music transcription,” in 2003 IEEE Workshop on Appli-
cations of Signal Processing to Audio and Acoustics (IEEE Cat. No.
03TH8684). IEEE, 2003, pp. 177–180.
[17] J. Le Roux, H. Kameoka, N. Ono, and S. Sagayama, “Fast signal
reconstruction from magnitude stft spectrogram based on spectrogram
consistency,” in International Conference on Digital Audio Effects, 2010.
[18] J. Bronson and P. Depalle, “Phase constrained complex nmf: Separating
overlapping partials in mixtures of harmonic musical sources,” in
2014 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP). IEEE, 2014, pp. 7475–7479.
[19] D. Kitamura and K. Yatabe, “Consistent independent low-rank matrix
analysis for determined blind source separation,” EURASIP Journal on
Advances in Signal Processing, vol. 2020, no. 1, pp. 1–35, 2020.
[20] C.-J. Lin, “Projected gradient methods for nonnegative matrix factoriza-
tion,” Neural computation, vol. 19, no. 10, pp. 2756–2779, 2007.
[21] N. Gillis and F. Glineur, “Accelerated multiplicative updates and hierarchical als algorithms for nonnegative matrix factorization,” Neural computation, vol. 24, no. 4, pp. 1085–1105, 2012.
[22] E. J. Candes, X. Li, and M. Soltanolkotabi, “Phase retrieval via wirtinger flow: Theory and algorithms,” IEEE Transactions on Information Theory, vol. 61, no. 4, pp. 1985–2007, 2015.