Proportionate Algorithms for Blind Source Separation
Michele Scarpiniti, Danilo Comminiello, Simone Scardapane,
Raffaele Parisi, and Aurelio Uncini
Department of Information Engineering, Electronics and Telecommunications (DIET),
“Sapienza” University of Rome, via Eudossiana 18, 00184, Rome
Abstract. In this paper we propose an extension of time-domain Blind Source Separation algorithms obtained by applying the well-known proportionate and improved proportionate adaptive algorithms. These algorithms, known in the context of adaptive filtering, are able to exploit the sparseness of the acoustic impulse responses of mixing environments and give better performance than standard algorithms. Some preliminary experimental results show the effectiveness of the proposed approach in terms of convergence speed.
Keywords: Blind Source Separation, Independent Component Analysis,
Proportionate algorithms, Improved proportionate.
1 Introduction

Blind Source Separation (BSS) applied to speech and audio signals is an attractive research topic in the field of adaptive signal processing [1,2]. The problem is to recover the original sources from a set of mixtures recorded in an unknown environment. While several well-performing approaches exist when the mixing environment is instantaneous, some problems arise in convolutive environments.
Several solutions have been proposed to solve BSS in convolutive environments [2,3]. Some of these solutions work in the time domain, others in the frequency domain. Each of them has advantages and disadvantages, but no single approach is a clear winner.
In addition, when working with speech and audio signals, convergence speed is an important issue. Since the impulse responses of standard environments, e.g. office rooms, are quite sparse, some authors have proposed to incorporate sparseness in the learning algorithm [5,6]. The idea is to introduce a weighting matrix in the update equation that gives more emphasis to the most important part of the impulse response. In particular, a proportionate [5] and an improved proportionate [7] algorithm were proposed for supervised signal processing applications, like acoustic echo cancellation.
In this paper we aim to extend these proportionate algorithms to the BSS problem, in the hope that they can be effective also in the unsupervised case. Hence, a proportionate and an improved proportionate version of the well-known time-domain Torkkola algorithm [8,9] will be proposed. Some preliminary results that demonstrate the effectiveness of the proposed idea in terms of convergence speed are also presented.
S. Bassis et al. (eds.), Recent Advances of Neural Network Models and Applications, Smart Innovation, Systems and Technologies 26, DOI: 10.1007/978-3-319-04129-2_10, © Springer International Publishing Switzerland 2014
The rest of the paper is organized as follows: Section 2 introduces the BSS problem in convolutive environments, Section 3 describes the proposed algorithm, and Section 4 shows some experimental results. Finally, Section 5 draws our conclusions.
2 Blind Source Separation for Convolutive Mixtures
Let us consider a set of $N$ unknown and independent sources denoted as $\mathbf{s}[n] = [s_1[n], \ldots, s_N[n]]^T$, such that the components $s_i[n]$ are zero-mean and mutually independent. The signals received by an array of $M$ sensors are denoted by $\mathbf{x}[n] = [x_1[n], \ldots, x_M[n]]^T$ and are called mixtures. For simplicity we consider the case $M = N$.
The convolutive model introduces the following relation between the $i$-th mixed signal and the original source signals:

$$x_i[n] = \sum_{j=1}^{N} \sum_{k=0}^{K-1} a_{ij}[k]\, s_j[n-k], \qquad i = 1, \ldots, M.$$

The mixed signal is thus a linear mixture of filtered versions of the source signals, where $a_{ij}[k]$ represents the $k$-th mixing filter coefficient and $K$ is the number of filter taps. The task is to estimate the independent components from the observations, without resorting to a priori knowledge about the mixing system, obtaining an estimate $\mathbf{u}[n]$ of the original source vector $\mathbf{s}[n]$:

$$u_i[n] = \sum_{j=1}^{M} \sum_{l=0}^{L-1} w_{ij}[l]\, x_j[n-l],$$

where $w_{ij}[l]$ denotes the $l$-th de-mixing filter coefficient and $L$ is the number of filter taps.
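As an illustration, the convolutive mixing model just described can be sketched in NumPy; the array layout and the helper name `convolutive_mix` are our own choices, not from the paper:

```python
import numpy as np

def convolutive_mix(s, A):
    """Mix N sources through an M x N bank of K-tap FIR filters.

    s : (N, T) array of source signals s_j[n]
    A : (M, N, K) array; A[i, j] holds the mixing filter a_ij[k]
    Returns the (M, T) array of mixtures x_i[n] = sum_j (a_ij * s_j)[n].
    """
    M, N, K = A.shape
    T = s.shape[1]
    x = np.zeros((M, T))
    for i in range(M):
        for j in range(N):
            # full convolution, truncated to the first T samples
            x[i] += np.convolve(s[j], A[i, j])[:T]
    return x

rng = np.random.default_rng(0)
s = rng.standard_normal((2, 1000))        # two zero-mean sources
A = rng.standard_normal((2, 2, 8)) * 0.3  # M = N = 2, K = 8 taps
x = convolutive_mix(s, A)
```

The de-mixing stage has exactly the same structure, with $w_{ij}[l]$ in place of $a_{ij}[k]$ and the mixtures in place of the sources.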
The weights $w_{ij}[l]$ can be adapted by minimizing some suitable cost function. A particularly good choice is to maximize the joint entropy or, equivalently, to minimize the mutual information [1,3]. Different approaches can be used, for example implementing the de-mixing algorithm in the time domain or in the frequency domain. In this paper we adopt the time-domain approach, using the algorithm proposed by Torkkola in [8], based on the feedback network shown in Figure 1 (for the particular case of $M = N = 2$) and described mathematically by:

$$u_i[n] = x_i[n] + \sum_{j=1}^{N} \sum_{l=1}^{L} w_{ij}[l]\, u_j[n-l].$$

This network will be used in the paper to achieve source separation in the time domain by maximizing the joint entropy of a nonlinear transformation of the network output: $y_i[n] = f(u_i[n])$, with $f(\cdot)$ a suitable nonlinear function, close to the source cumulative density function [1,8]. In this work we use $f(\cdot) = \tanh(\cdot)$. The $k$-th weight of the de-mixing filter, $w_{ij}[k]$, is adapted by using the general rule:

$$w_{ij}^{(p+1)}[k] = w_{ij}^{(p)}[k] + \mu\, \Delta w_{ij}[k],$$

where in particular the stochastic gradient method can be used to compute the correction term $\Delta w_{ij}[k]$, $\mu$ is the learning rate and $p$ is the iteration index.
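For concreteness, a possible (non-authoritative) sketch of the feedback de-mixing network and its tanh outputs for the $M = N = 2$ case follows; the restriction to cross-channel feedback taps and the function name `feedback_demix` are our assumptions:

```python
import numpy as np

def feedback_demix(x, W):
    """Feedback de-mixing network for the 2 x 2 case (sketch).

    x : (2, T) array of mixtures
    W : (2, 2, L) array; W[i, j, l-1] plays the role of w_ij[l],
        the cross-channel feedback tap acting on u_j[n - l]
    Returns u and the nonlinear outputs y = tanh(u).
    """
    _, T = x.shape
    L = W.shape[2]
    u = np.zeros((2, T))
    for n in range(T):
        for i in range(2):
            j = 1 - i  # only cross-channel feedback in this sketch
            lo = max(0, n - L)
            past = u[j, lo:n][::-1]   # u_j[n-1], u_j[n-2], ...
            u[i, n] = x[i, n] + W[i, j, :len(past)] @ past
    return u, np.tanh(u)
```

Because the network is recursive, each output sample is computed causally from already-computed past outputs of the other channel.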
Fig. 1. The architecture for BSS in the time domain in the particular case of $M = N = 2$
3 The Proposed Algorithm
Standard BSS algorithms have been proposed so far in several works [3,8,9,10]. More recently, some authors have underlined the importance of the sparseness of the impulse responses $a_{ij}[n]$. In order to improve the performance of standard adaptive algorithms (LMS and NLMS [11]), which do not take sparseness into account, these authors proposed some modifications introducing the so-called Proportionate NLMS (PNLMS) [5] and Improved Proportionate NLMS (IPNLMS) [7]. The resulting algorithms derive from a more general class of regularized gradient adaptive algorithms [12,13].
In this class of algorithms, the update term is simply multiplied by a matrix $\mathbf{G}[k]$, whose entries $g_{ij}[k]$ are chosen using different criteria (NLMS, PNLMS and IPNLMS) and take into account the sparseness of the impulse response: the parameter variation is made proportional to the impulse response itself.
The aim of this section is to extend the previous ideas on proportionate adaptive algorithms to the time-domain BSS algorithm proposed by Torkkola [8], deriving a new updating rule for the de-mixing filter matrix $\mathbf{W}[k]$.

Based on [5] and [7], the proposed modification to Torkkola's algorithm results in the following update:

$$w_{ij}^{(p+1)}[k] = w_{ij}^{(p)}[k] + \mu\, g_{ij}[k]\, \Delta w_{ij}[k],$$

where the correction term $\Delta w_{ij}[k]$ is computed from the nonlinear outputs $y_i[n] = f(u_i[n]) = \tanh(u_i[n])$.
The $k$-th parameter $g_{ij}[k]$, in the case of the Proportionate BSS (PBSS) algorithm, is chosen as

$$g_{ij}[k] = \frac{\gamma_{ij}[k]}{\frac{1}{L}\sum_{l=0}^{L-1} \gamma_{ij}[l]}, \qquad \gamma_{ij}[k] = \max\Big\{ \rho \max\big\{ \delta, |w_{ij}[0]|, \ldots, |w_{ij}[L-1]| \big\},\, |w_{ij}[k]| \Big\},$$

where $\rho$ and $\delta$ are small positive constants.
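The PBSS gain computation can be sketched per filter as follows, assuming the standard PNLMS weighting with small constants $\rho$ and $\delta$ (the function name `pnlms_gains` is ours):

```python
import numpy as np

def pnlms_gains(w, rho=0.01, delta=0.01):
    """PNLMS-style proportionate gains for one filter w (sketch).

    gamma[k] = max(rho * max(delta, |w[0]|, ..., |w[L-1]|), |w[k]|)
    g[k]     = gamma[k] / ((1/L) * sum_l gamma[l])
    Large taps receive proportionally larger step sizes; mean(g) = 1.
    """
    gamma = np.maximum(rho * max(delta, np.abs(w).max()), np.abs(w))
    return gamma / gamma.mean()
```

Here $\delta$ keeps the gains well defined when the filter is still close to zero, while $\rho$ floors the gain of inactive taps so they never stop adapting.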
A second proposal is the following Improved PBSS (IPBSS) choice for the $k$-th parameter:

$$g_{ij}[k] = \frac{1-\alpha}{2L} + (1+\alpha)\, \frac{|w_{ij}[k]|}{2 \sum_{l=0}^{L-1} |w_{ij}[l]| + \varepsilon},$$

where $-1 \le \alpha \le 1$ is a constant and $\varepsilon$ is a small positive regularization term.
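Similarly, a minimal sketch of the IPBSS gain, following the usual IPNLMS form (the function name and the regularization constant `eps` are our assumptions):

```python
import numpy as np

def ipnlms_gains(w, alpha=0.1, eps=1e-8):
    """IPNLMS-style proportionate gains for one filter w (sketch).

    Blends a uniform (NLMS-like) term with a proportionate term,
    controlled by alpha in [-1, 1]; alpha -> -1 recovers the uniform case.
    """
    L = len(w)
    return (1 - alpha) / (2 * L) + (1 + alpha) * np.abs(w) / (2 * np.abs(w).sum() + eps)
```

For a sparse filter the proportionate term dominates at the large taps, while the uniform term guarantees that every tap keeps adapting.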
4 Experimental Results
Some experimental results are proposed using two speech signals sampled at 8 kHz. Three different synthetic mixing weight sets were tested, and a total of 5000 samples was used for each signal.
The first set of weights is a very simple and sparse set.
A denser set of weights is used in the second case, perhaps more realistic for a room that produces notable reverberation, where one microphone is imagined close to the first source and the other close to the second:

$$a_{ij}[5n-1] = e^{-n}, \qquad n = 1, \ldots, 120, \quad i, j = 1, 2, \qquad (11)$$

that is, the strongest input arrives respectively from the two sources, with subsequent echoes every five taps of the filter that decay exponentially.
The third set has non-zero elements over the whole length of the filter, but is otherwise similar to the previous weights and is meant to simulate the same situation:

$$a_{ij}[n] = e^{-(1+0.4n)}, \qquad n = 1, \ldots, 200, \quad i, j = 1, 2. \qquad (12)$$
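Under the tap formulas above, the second and third weight sets can be generated as follows; the mixing-filter lengths chosen here are our assumption, since the source states only the tap indices of Eqs. (11) and (12):

```python
import numpy as np

def weight_set_2(length=600):
    """Sparse echoes every five taps with exponential decay, cf. Eq. (11)."""
    a = np.zeros(length)
    for n in range(1, 121):
        if 5 * n - 1 < length:
            a[5 * n - 1] = np.exp(-n)   # a_ij[5n - 1] = e^{-n}
    return a

def weight_set_3(length=200):
    """Dense exponential decay over the whole filter, cf. Eq. (12)."""
    n = np.arange(1, length + 1)
    return np.exp(-(1 + 0.4 * n))       # a_ij[n] = e^{-(1 + 0.4 n)}
```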
Furthermore, the de-mixing weights are not updated at each iteration of the learning rule; rather, the contributions from the learning rule are summed over 150 iterations between each update of the filter, for a robust estimation. When using the $\mathbf{G}$ matrix from the PBSS algorithm, we always set the parameters $\rho$ and $\delta$ to 0.01, as suggested as good values by [5]. When using IPBSS, we always use the parameter $\alpha = 0.1$. In both cases we use a de-mixing filter length of $L = 200$ samples and a fixed learning rate.
We will face the more complicated problem of separating real-world mixtures in a future work.

Now we ask the natural question: does the proposed method, using a $\mathbf{G}$ matrix from either the PBSS or the IPBSS method, improve the results of the standard algorithm? The answer is that it depends on the filter length or, more generally, on how sparse the filter is.
In order to evaluate the convergence of the algorithm, an estimate $I$ of the mutual information is used as convergence and performance index [14,15].
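As an illustration, a simple histogram-based estimator of the mutual information between two recovered outputs can be sketched as follows; this is a generic estimator, not necessarily the one used in [14,15]:

```python
import numpy as np

def mutual_information(u1, u2, bins=32):
    """Histogram estimate of I(U1; U2) in nats (sketch).

    I = sum_ab p(a, b) * log( p(a, b) / (p(a) * p(b)) ),
    which is near zero when the two outputs are independent.
    """
    p_joint, _, _ = np.histogram2d(u1, u2, bins=bins)
    p_joint = p_joint / p_joint.sum()
    p1 = p_joint.sum(axis=1, keepdims=True)   # marginal of u1
    p2 = p_joint.sum(axis=0, keepdims=True)   # marginal of u2
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (p1 @ p2)[mask])))
```

As separation proceeds, the outputs become closer to independent and the index decreases toward zero (up to the estimator's bias).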
4.1 Results of Proportionate Algorithm
As we will see in this section, there are improvements in convergence speed over the standard algorithm. Figure 2 shows the convergence history in terms of the estimated mutual information, and we can clearly see a considerable speed-up in convergence: already at epoch 10 the PBSS algorithm has almost reached convergence, while the standard algorithm does not reach this level until around epoch 30. At epoch 20 the PBSS algorithm has practically reached convergence, while the standard algorithm reaches convergence only at around epoch 80.
Fig. 2. Comparison of the standard algorithm and the algorithm with the $\mathbf{G}$ matrix from the PBSS method, for the three weight sets (estimated mutual information versus epoch, 0-80)
This suggests that the algorithm converges between 3 and 4 times faster when using the $\mathbf{G}$ matrix from the PBSS method, which shows that by adaptively changing the learning rate for the different taps of the filter we get a significantly better performance.

As can be seen in Figure 2, at convergence the solutions of both algorithms reach the same level of mutual information. It would be interesting to see if there are any differences when looking at the recovered de-mixing filters directly. Figure 3 shows the solution at convergence for both methods using the first weight set: the main features of both solutions are present in both filters at approximately the same filter taps, albeit at slightly different scalings. Apart from that, there are other small differences, but they seem to be mostly noise in the filter. Qualitatively, from listening to the de-mixing of the mixed sound files, we could discern no differences between the two solutions at convergence. Thus it seems reasonable to conclude that both methods eventually converge to the same solution, but that using the $\mathbf{G}$ matrix from the PBSS method we obtain convergence and a good solution significantly faster.
Fig. 3. Solution at convergence using the first set of weights, for a) the standard algorithm and b) the proposed PBSS algorithm (filter taps 0-200)
4.2 Results of Improved Proportionate Algorithm
Similar results can be obtained with the IPBSS method. Figure 4 shows the convergence history of the standard algorithm versus the proposed algorithm using the $\mathbf{G}$ matrix from the IPBSS method. As we can see, convergence is significantly faster than what is possible with the standard algorithm. It is also interesting to note that there is more noise in the mutual information at convergence with respect to the standard algorithm. However, looking at the converged solution, there is once again no difference from the standard algorithm. Qualitatively, from listening to the solutions, the authors could not hear any difference. We have therefore concluded that the difference is negligible and might have been caused by the noise in the mutual information at convergence.

The IPBSS algorithm turned out to be slightly faster than the PBSS one for these examples. From Figure 4 we can see that the value of the mutual information reached by the proposed IPBSS algorithm at the first epoch is reached by the standard algorithm only between epochs 4 and 5, while for PBSS the value of the mutual
Fig. 4. Comparison of the standard algorithm and the algorithm with the $\mathbf{G}$ matrix from the IPBSS method, for the three weight sets (estimated mutual information versus epoch, 0-80)
information at the first iteration is reached between epochs 2 and 3. A similar reasoning can be made for the other epochs.

As concluded in the previous section, this indicates that the proposed algorithm using the PBSS method speeds up the first iterations of the convergence by 3-4 times. From this we can also infer that the IPBSS method further increases the speed of convergence, suggesting an improvement of 20-25% over the PBSS algorithm.
5 Conclusions

In this paper some preliminary results on a proportionate and an improved proportionate version of the well-known time-domain Torkkola BSS algorithm have been proposed. The proposed idea is to exploit the sparseness of the acoustic impulse responses of the mixing environment by applying proportionate algorithms known in the context of adaptive filtering for speech. Some experimental results have shown the effectiveness of the proposed approach in terms of convergence speed and encourage a more thorough theoretical treatment of these novel classes of separation algorithms.
Acknowledgment. The authors would like to thank Mr. Gaute Halvorsen for his valuable help in performing some of the presented experiments.
References

1. Cichocki, A., Amari, S.: Adaptive Blind Signal and Image Processing. John Wiley (2002)
2. Makino, S., Lee, T.W., Sawada, H.: Blind Speech Separation. Springer (2007)
3. Choi, S., Cichocki, A., Park, H.M., Lee, S.Y.: Blind source separation and independent com-
ponent analysis: a review. Neural Information Processing - Letters and Reviews 6(1), 1–57
4. Araki, S., Mukai, R., Makino, S., Nishikawa, T., Saruwatari, H.: The fundamental limita-
tion of frequency domain blind source separation for convolutive mixtures of speech. IEEE
Transactions on Speech and Audio Processing 11(2), 109–116 (2003)
5. Duttweiler, D.L.: Proportionate normalized least-mean-square adaptation in echo cancelers.
IEEE Transactions on Speech and Audio Processing 8, 508–518 (2000)
6. Huang, Y., Benesty, J., Chen, J.: Acoustic MIMO Signal Processing. Springer (2006)
7. Benesty, J., Gay, S.L.: An improved PNLMS algorithm. In: Proc. of IEEE International Con-
ference on Acoustics, Speech, and Signal Processing (ICASSP 2002), pp. 1881–1884 (2002)
8. Torkkola, K.: Blind separation of convolved sources based on information maximization. In:
Proc. of the 1996 IEEE Signal Processing Society Workshop on Neural Networks for Signal
Processing, September 4-6, pp. 423–432 (1996)
9. Torkkola, K.: Blind deconvolution, information maximization and recursive ﬁlters. In:
Proc. of 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP 1997), April 21-24, pp. 3301–3304 (1997)
10. Torkkola, K.: Blind separation of delayed sources based on information maximization. In:
Proc. of 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP 1996), May 7-10, pp. 3509–3512 (1996)
11. Haykin, S.: Adaptive Filter Theory, 4th edn. Prentice-Hall (2001)
12. Ouedraogo, W.S.B., Jaidane, M., Souloumiac, A., Jutten, C.: Regularized gradient algorithm
for non-negative independent component analysis. In: Proc. of IEEE International Confer-
ence on Acoustics, Speech and Signal Processing (ICASSP 2011), Prague, Czech Republic,
May 22-27, pp. 2524–2527 (2011)
13. Boulmezaoud, T.Z., El Rhabi, M., Fenniri, H., Moreau, E.: On convolutive blind source sep-
aration in a noisy context and a total variation regularization. In: Proc. of IEEE Eleventh In-
ternational Workshop on Signal Processing Advances in Wireless Communications (SPAWC
2010), Marrakech, June 20-23, pp. 1–5 (2010)
14. Masulli, F., Valentini, G.: Mutual information methods for evaluating dependence among
outputs in learning machines. Technical Report TR-01-02, Dipartimento di Informatica e
Scienze dell’Informazione, Università di Genova (2001)
15. Torkkola, K.: Learning feature transforms is an easier problem than feature selection. In:
Proc. of 16th International Conference on Pattern Recognition, August 11-15, pp. 104–107