SEPARATION OF POST-NONLINEAR MIXTURES USING ACE AND TEMPORAL
DECORRELATION
Andreas Ziehe, Motoaki Kawanabe, Stefan Harmeling and Klaus-Robert Müller
GMD FIRST.IDA, Kekuléstr. 7, 12489 Berlin, Germany
University of Potsdam, Am Neuen Palais 10, 14469 Potsdam, Germany
{ziehe,nabe,harmeli,klaus}@first.gmd.de
ABSTRACT
We propose an efficient method based on the concept of maximal correlation that reduces the post-nonlinear blind source separation problem (PNL BSS) to a linear BSS problem. For this we apply the Alternating Conditional Expectation (ACE) algorithm, a powerful technique from nonparametric statistics, to approximately invert the (post-)nonlinear functions. Interestingly, in the framework of the ACE method convergence can be proven, and in the PNL BSS scenario the optimal transformations found by ACE coincide with the desired inverse functions. After the nonlinearities have been removed by ACE, temporal decorrelation (TD) allows us to recover the source signals. Excellent performance on realistic examples underlines the validity of the ACE-TD approach.
1. INTRODUCTION
Blind source separation (BSS) research has mainly focused on variants of linear ICA and temporal decorrelation methods (see e.g. [14, 6, 5, 7, 1, 2, 13, 29, 22, 12]). Linear BSS assumes that at time $t$ each component $x_i(t)$ of the observed $n$-dimensional data vector $x(t)$ is a linear combination of $n$ statistically independent signals: $x(t) = A s(t)$ (e.g. [12]). The source signals $s_j(t)$ are unknown, as are the coefficients $a_{ij}$ of the mixing matrix $A$. The goal is therefore to estimate both unknowns from the observed signals $x(t)$, i.e. a separating matrix $W$ and signals $y(t) = W x(t)$ that estimate $s(t)$.
However, nonlinearities that distort the mixed signals pose a challenging problem for “conventional” BSS methods, where the mixing model is linear instantaneous or convolutive. The general nonlinear mixing model is (cf. [12])

$$x(t) = f(s(t)) \qquad (1)$$
To whom correspondence should be addressed.
This work was partly supported by the EU under contract IST-1999-14190 – BLISS.
where $f$ is an arbitrary nonlinear transformation (at least approximately invertible). An important special case is post-nonlinear (PNL) mixtures

$$x_i(t) = f_i\Big(\sum_{j} a_{ij}\, s_j(t)\Big) \qquad (2)$$

where each $f_i$ is an invertible nonlinear function that operates componentwise and $A = (a_{ij})$ is a linear mixing matrix. Because this PNL model, which was introduced by Taleb and Jutten [25], is an important subclass with interesting properties, it has attracted the interest of several researchers [25, 15, 27]. Furthermore, it is often an adequate model of real-world physical systems in which nonlinear transfer functions appear; e.g. in telecommunications or biomedical data recording, sensors can have nonlinear characteristics.
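The PNL model of eq. (2) is easy to simulate. A minimal sketch (the AR coefficients, mixing matrix and distortions below are illustrative stand-ins, not the settings used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two temporally correlated sources: simple AR(1) processes
# (illustrative coefficients, not those of the paper's experiments).
T = 2000
s = np.zeros((2, T))
for t in range(1, T):
    s[:, t] = np.array([0.9, 0.7]) * s[:, t - 1] + rng.standard_normal(2)

# Linear mixing u(t) = A s(t), then invertible componentwise
# distortions applied per channel (stand-ins for f_1, f_2).
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])
u = A @ s
x = np.vstack([np.tanh(u[0]), np.cbrt(u[1])])   # observed PNL mixture
```

Because the distortions are strictly monotone, each observed channel preserves the ordering of the underlying linear mixture, which is exactly the invertibility the model requires.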
Fig. 1: Building blocks of the PNL mixing model and the separation system.
Algorithmic solutions of eq. (2) have used e.g. self-organizing maps [20, 18], extensions of GTM [21], neural networks [27, 19], parametric sigmoidal functions [16] or ensemble learning [26] to approximate the nonlinearity $f$ (or its inverse $g$). Kernel-based methods have also been tried, first on very simple toy signals [8] and more recently on real-world data using temporal decorrelation in feature space [10]. Note that most existing methods (except [10]) have high computational cost and, depending on the algorithm, are prone to run into local minima.
In our approach to the PNL BSS problem we first approximately invert the post-nonlinearity using the ACE algorithm (estimating $g$) and then apply a standard BSS technique [3, 29] that relies on temporal decorrelation (estimating the unmixing matrix $W$) (cf. Fig. 1). By virtue of the ACE framework, which is briefly introduced in subsection 2.2, we prove that the algorithm converges to the correct inverse nonlinearities, provided that they exist. Some implementation issues are discussed and numerical simulations illustrating the method are described in section 3. Finally, a conclusion is given in section 4.
2. METHODS
For the sake of simplicity we introduce our method for the $2 \times 2$ case. The extension to the general $n \times n$ case is easily possible, but omitted for better readability.
2.1. Problem statement
Let us consider the two-dimensional post-nonlinear mixing model

$$x_1(t) = f_1\big(a_{11} s_1(t) + a_{12} s_2(t)\big), \qquad x_2(t) = f_2\big(a_{21} s_1(t) + a_{22} s_2(t)\big),$$

where $s_1$ and $s_2$ are independent source signals that are temporally correlated, $x_1$ and $x_2$ are the observed signals, $A = (a_{ij})$ is the mixing matrix and $f_1$ and $f_2$ are the componentwise nonlinear transformations, which are invertible.
Obviously, any attempt to separate such a mixture by a linear BSS algorithm will fail, unless one can invert the functions $f_1$ and $f_2$ at least approximately. In this work we propose that this can be achieved by maximizing the correlation

$$\mathrm{corr}\big[g_1(x_1),\, g_2(x_2)\big] \qquad (3)$$

with respect to nonlinear functions $g_1$ and $g_2$. This means we want to find transformations $g_1$ and $g_2$ of the observed signals such that the relationship between the transformed variables becomes linear. Intuitively speaking, the relationship is linear if the signals are aligned in a scatter plot, i.e. if they are maximally correlated. Under certain conditions that we will state in detail later, this problem is solved by the ACE method, which finds so-called optimal transformations $g_1^*$ and $g_2^*$ that maximize eq. (3). One can prove existence and uniqueness of those optimal transformations, and it can be shown that the ACE algorithm, which is described in the following, converges to these solutions (cf. [4]).
2.2. ACE algorithm
The ACE algorithm is an iterative procedure for finding the optimal nonlinear functions $g_1$ and $g_2$. The starting point is the observation that for fixed $g_2$ the optimal $g_1$ is given by

$$g_1(x_1) = E\big[g_2(x_2) \mid x_1\big]$$

and conversely, for fixed $g_1$ the optimal $g_2$ is

$$g_2(x_2) = E\big[g_1(x_1) \mid x_2\big].$$

The key idea of the ACE algorithm is therefore to compute alternately the respective conditional expectations. To avoid trivial solutions one normalizes $g_1(x_1)$ in each step by using the function norm $\|g\| = \sqrt{E[g^2]}$. The algorithm for two variables is summarized below. It is also possible to extend the procedure to the multivariate case; for further details we refer to [11, 4].

Algorithm 1: The ACE algorithm for two variables
  initialize $g_1(x_1) = x_1 / \|x_1\|$
  repeat
    $g_2(x_2) = E\big[g_1(x_1) \mid x_2\big]$
    $g_1(x_1) = E\big[g_2(x_2) \mid x_1\big] \,/\, \big\|E[g_2(x_2) \mid x_1]\big\|$
  until $E\big[(g_1(x_1) - g_2(x_2))^2\big]$ fails to decrease
An important point in the implementation of this algorithm is the estimation of the conditional expectations from the data. Usually, the conditional expectations are computed by data smoothing, for which numerous techniques exist (cf. [4, 9]). Care has to be taken to balance the trade-off between fidelity to the data and smoothness of the estimated curve. Our implementation uses a nearest-neighbor smoother that applies a simple moving-average filter to appropriately sorted data.
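A compact version of this scheme, a moving-average smoother on sorted data inside a two-variable ACE loop, might look as follows; the function names and the fixed iteration count are our own choices, not the authors':

```python
import numpy as np

def cond_exp(target, cond, width=31):
    """Estimate E[target | cond] with a nearest-neighbour smoother:
    order the samples by cond, moving-average the target values,
    then map the smoothed values back to the original sample order."""
    order = np.argsort(cond)
    kernel = np.ones(width) / width
    smoothed = np.convolve(target[order], kernel, mode="same")
    out = np.empty_like(smoothed)
    out[order] = smoothed
    return out

def ace_two(x1, x2, n_iter=50, width=31):
    """Two-variable ACE: alternate the conditional expectations,
    standardizing g1 in each step to avoid the trivial zero solution."""
    g1 = (x1 - x1.mean()) / x1.std()
    for _ in range(n_iter):
        g2 = cond_exp(g1, x2, width)          # g2(x2) = E[g1(x1) | x2]
        g1 = cond_exp(g2, x1, width)          # g1(x1) = E[g2(x2) | x1]
        g1 = (g1 - g1.mean()) / g1.std()      # enforce ||g1|| = 1
    g2 = (g2 - g2.mean()) / g2.std()
    return g1, g2
```

On a post-nonlinearly distorted Gaussian pair, the estimated $g_1$ should be (up to sign and scale) close to the hidden linear variable, i.e. it approximately inverts the distortion.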
By applying $g_1$ and $g_2$ to the mixed signals $x_1$ and $x_2$ we remove the effect of the nonlinear functions $f_1$ and $f_2$. In the following we substantiate this claim more formally. We show for $u_1 = a_{11} s_1 + a_{12} s_2$ and $u_2 = a_{21} s_1 + a_{22} s_2$ that $g_1$ and $g_2$ obtained from the ACE procedure are the desired inverse functions in the case that $u_1$ and $u_2$ are jointly normally distributed; in other words, we prove the following relationship:

$$g_i(x_i) \propto f_i^{-1}(x_i), \qquad i = 1, 2. \qquad (4)$$

Almost all work for the proof has already been done in Proposition 5.4 and Theorem 5.3 of [4], which, by noticing that the correlation of two signals does not change if we scale one or both signals, implies:

$$g_1(x_1) \propto E\big[g_2(x_2) \mid x_1\big], \qquad g_2(x_2) \propto E\big[g_1(x_1) \mid x_2\big].$$

Note that the conditional expectation $E[g_2(x_2) \mid x_1]$ is a function of $x_1$ and the expectation is taken with respect to $x_2$; analogously for the second expression. Since $x_1 = f_1(u_1)$ and $x_2 = f_2(u_2)$, and furthermore $g_1(x_1) = g_1(f_1(u_1))$ and $g_2(x_2) = g_2(f_2(u_2))$, we get:

$$g_1(f_1(u_1)) \propto E\big[g_2(f_2(u_2)) \mid f_1(u_1)\big], \qquad g_2(f_2(u_2)) \propto E\big[g_1(f_1(u_1)) \mid f_2(u_2)\big].$$
Because $f_1$ and $f_2$ are invertible functions they can be omitted in the condition of the conditional expectation, leading us to:

$$g_1(f_1(u_1)) \propto E\big[g_2(f_2(u_2)) \mid u_1\big], \qquad g_2(f_2(u_2)) \propto E\big[g_1(f_1(u_1)) \mid u_2\big]. \qquad (5)$$
Assuming that the vector $(u_1, u_2)$ is normally distributed and the correlation $\mathrm{corr}[u_1, u_2]$ does not vanish, a straightforward calculation shows

$$E[u_2 \mid u_1] \propto u_1, \qquad E[u_1 \mid u_2] \propto u_2.$$

This means that $g_1 = f_1^{-1}$ and $g_2 = f_2^{-1}$ satisfy eq. (5), which then immediately implies our claim, eq. (4). Fortunately, in our application the above assumptions are usually fulfilled because mixed signals are more Gaussian and more correlated than unmixed signals. On the other hand, even if the assumptions are not perfectly met, experiments show that the ACE algorithm still equalizes the nonlinearities well.
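The linearity of the Gaussian conditional expectation used in this step can be checked numerically. A quick Monte-Carlo sketch with unit-variance variables, where theory gives $E[u_2 \mid u_1] = \rho\, u_1$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Jointly Gaussian (u1, u2) with correlation rho and unit variances.
rho, n = 0.8, 200_000
z = rng.standard_normal((2, n))
u1 = z[0]
u2 = rho * u1 + np.sqrt(1 - rho**2) * z[1]

# Bin-average u2 over u1 to estimate the conditional expectation.
bins = np.linspace(-2, 2, 21)
idx = np.digitize(u1, bins)
centers, cond_means = [], []
for b in range(1, len(bins)):
    mask = idx == b
    centers.append(0.5 * (bins[b - 1] + bins[b]))
    cond_means.append(u2[mask].mean())

# Maximum deviation of the bin means from the line rho * u1.
err = np.max(np.abs(np.array(cond_means) - rho * np.array(centers)))
```

The bin means fall on a straight line through the origin with slope $\rho$, which is exactly why the identity transformations solve eq. (5) in the Gaussian case.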
Summarizing the key idea: by searching for nonlinear transformations that maximize the linear correlation between the nonlinearly transformed observed variables, we can approximate the inverses of the post-nonlinearities.
2.3. Source separation
For a separation of the signals one could in principle ap-
ply any BSS technique, capable of solving the now approx-
imately linear problem. However, experiments show that
only second-order methods which use temporal informa-
tion are sufficiently robust to reliably recover the sources.
Therefore we use TDSEP, an implementation based on the
simultaneous diagonalizationof several time-delayed corre-
lation matrices for the blind identification of the unmixing
matrix (cf. [3, 29, 28]).
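As a simplified stand-in for TDSEP, which jointly diagonalizes many lagged correlation matrices, the two-matrix special case can be written in a few lines. This is an AMUSE-style sketch, not the authors' implementation:

```python
import numpy as np

def lagged_cov(x, tau):
    """Symmetrized time-lagged covariance matrix C_tau of x (channels x time)."""
    xc = x - x.mean(axis=1, keepdims=True)
    T = xc.shape[1]
    c = xc[:, : T - tau] @ xc[:, tau:].T / (T - tau)
    return 0.5 * (c + c.T)

def separate(x, tau=1):
    """Whiten with C_0, then diagonalize the whitened C_tau; recovers the
    sources when they have distinct lag-tau autocorrelations."""
    d, E = np.linalg.eigh(lagged_cov(x, 0))
    W_white = E @ np.diag(1.0 / np.sqrt(d)) @ E.T    # whitening transform
    c_tau = W_white @ lagged_cov(x, tau) @ W_white.T
    _, V = np.linalg.eigh(0.5 * (c_tau + c_tau.T))   # orthogonal diagonalizer
    return V.T @ W_white                             # estimated unmixing matrix
```

Applied to a linear mixture of AR sources with different autocorrelations, the product of the estimated unmixing matrix and the true mixing matrix is close to a scaled permutation.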
3. NUMERICAL SIMULATIONS
To demonstrate the performance of the proposed method we
applyour algorithm to severalpost-nonlinearmixtures, both
instantaneous and convolutive.
The first data set consists of Gaussian AR-processes of
the form:
(6)
where is white Gaussian noise with mean zero and vari-
ance . For the experiment we choose , ,
and generate 2000 data points.
We use a mixing matrix to get linearly mixed sig-
nals and apply strong nonlinear distortions
(7)
which were also used by Taleb and Jutten in [24]. The dis-
tributionof these mixedsignals has a highly nonlinearstruc-
ture as visible in the scatter plot in Fig. 2.
Fig. 2: (a) Scatter plot of the mixed AR processes ($x_1$ vs. $x_2$) and (b) waveforms of the original sources (top), the linearly unmixed signals (middle) and recovered sources (bottom).
Fig. 3: (a) Nonlinear functions $f_1$ and $f_2$. (b) True (thin line) and estimated (bold line) inverse functions $g_1$ and $g_2$.
The application of the ACE algorithm using a local nearest-neighbor smoother (window length 31) for the conditional expectation yields the estimated nonlinear functions $g_1$ and $g_2$ shown in Fig. 3. We see that the true inverses of the nonlinearities $f_1$ and $f_2$ are approximated well. Although the match is not perfect (it could be improved by better smoothers), it is now possible to separate the signals using the TDSEP algorithm, where 20 time-delayed correlation matrices are simultaneously diagonalized. Figure 2 (b) shows that the waveforms of the recovered sources closely resemble the original ones, while linear unmixing of the PNL mixture clearly cannot recover the sources. This is also confirmed by comparing the output distributions shown in Fig. 4 as a scatter plot.
One favorable property of our method is its nice scaling
behavior. To show this, we will now test the algorithm with
Fig. 4: Scatter plot of the output distribution of a linear algorithm (‘+’) and the proposed nonlinear ACE-TD algorithm (‘.’).
natural audio sources, where the input data set consists of 4 sound signals with 20,000 data points each. For this case we apply the multivariate version of the ACE algorithm, which computes the optimal functions by maximizing the generalized correlation criterion $\mathrm{corr}\big[g_1(x_1), \sum_{i=2}^{n} g_i(x_i)\big]$. For details of the implementation we refer to [4, 11, 9]. As in the first experiment, these source signals were mixed by a linear model with a random ($4 \times 4$) matrix $A$, and after the linear mixing the nonlinearities $f_i$ (8) were applied. Figure 5 shows the results of the separation using ACE-TD (smoothing window length 51) and TDSEP. We again observe very good separation performance, which is quantified by calculating the correlation coefficients (shown in Table 1) between the source signals and the extracted components. This is also confirmed by listening to the separated audio signals, where we perceive almost no crosstalk, although the noise level is slightly increased (cf. the silent parts of signal 2 in Fig. 5).
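The performance measure used here, the matrix of correlation coefficients between sources and extracted components, is straightforward to compute. A small helper (the function name is ours):

```python
import numpy as np

def corr_table(sources, estimates):
    """Correlation coefficients between each source signal and each
    extracted component, as reported in Table 1-style summaries."""
    k = sources.shape[0]
    return np.corrcoef(np.vstack([sources, estimates]))[:k, k:]
```

A matrix that is close to a (sign- and permutation-ambiguous) identity indicates good separation.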
The third experiment gives an example of the application of our method to convolutive mixtures with a PNL distortion. We deliberately distorted real-room recordings¹ of speech and background music made by Lee [17] with nonlinear transfer functions as in our first example (cf. eq. (7)). For the separation we apply a convolutive BSS algorithm of Parra et al. that requires only second-order statistics by

¹ Available on the internet via http://sloan.salk.edu/~tewon/Blind/blind audio.html
Fig. 5: Four-channel audio dataset: (a) waveforms of the original sources, (b) linearly unmixed signals with TDSEP and (c) recovered sources using ACE-TD.
exploiting the non-stationarity of the signals [23]. While an unmixing of the distorted recordings obviously fails, we could achieve a good separation after the unsupervised linearization by the ACE procedure (cf. Fig. 6).
4. DISCUSSION AND CONCLUSION
In this work we proposed a simple technique for the blind
separation of linear mixtures with a post-nonlinear distor-
tion. The main ingredients of our algorithm, which we call
ACE-TD, are: first, a search for nonlinear transformations
that maximize the linear correlations between transformed
variables and which approximate the inverses of the PNLs.
This search can be done highly efficient by the ACE tech-
nique [4] from non-parametric statistics, that performs an
alternatingestimation ofconditionalexpectationsby smooth-
ing of scatter plots. Effectively, this nonlinear modeling
procedure solves the PNL mixture problem by transform-
ing it back into a linear one. Therefore, second, a temporal
decorrelation BSS algorithm (e.g. [3, 29]) can be applied.
Table 1: Correlation coefficients between the source signals and the extracted components for the signals shown in Fig. 5.

TDSEP:
   0.10   0.56   0.31  -0.13
  -0.01   0.26   0.02   0.47
   0.06   0.12   0.76  -0.05
  -0.07   0.66  -0.21   0.11

ACE-TD:
   0.97  -0.01  -0.005  0.03
   0.03   0.94  -0.02  -0.005
   0.01   0.07   0.95  -0.007
   0.04   0.002  0.001  0.96
Clearly, ACE is not limited to the $2 \times 2$ case but scales naturally to the $n \times n$ case, for which an algorithmic description can be found in [4, 9]. Moreover, the algorithm can make beneficial use of additional sensors in the overdetermined BSS case, as the joint distribution of the mixed signals then becomes more and more Gaussian, which is beneficial for ACE. Furthermore, our method also works for convolutive mixtures, which is attractive for real-room BSS, where nonlinear transfer functions of the sensors (microphones) or amplifiers would impede a proper separation. In conclusion, the proposed framework gives a simple, highly efficient algorithm with a solid theoretical background for signal separation in applications with a PNL distortion, which are of importance e.g. in real-world sensor technology.

Future research will be concerned with better tuning of the smoothers, which are essential in the ACE algorithm, to the PNL blind source separation scenario.
5. REFERENCES
[1] S.-I. Amari, A. Cichocki, and H. H. Yang. A new learning algorithm for blind source separation. In Advances in Neural Information Processing Systems 8, pages 757–763. MIT Press, 1996.
[2] A.J. Bell and T.J. Sejnowski. An information-
maximization approach to blind separation and blind
deconvolution. Neural Computation, 7:1129–1159,
1995.
[3] A. Belouchrani, K. Abed Meraim, J.-F. Cardoso, and
E. Moulines. A blind source separation technique
based on second order statistics. IEEE Trans. on Sig-
nal Processing, 45(2):434–444, 1997.
[4] L. Breiman and J. H. Friedman. Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391):580–598, September 1985.
Fig. 6: Two-channel audio dataset: (a) waveforms of the recorded (undistorted) microphone signals, (b) observed PNL-distorted signals, (c) result of ACE, (d) recovered sources using ACE and a convolutive BSS algorithm and (e) for comparison, convolutive BSS separation result for the undistorted signals from (a).
[5] J.-F. Cardoso and A. Souloumiac. Blind beamforming for non-Gaussian signals. IEE Proceedings-F, 140(6):362–370, 1993.
[6] P. Comon. Independent component analysis—a new concept? Signal Processing, 36:287–314, 1994.
[7] G. Deco and D. Obradovic. Linear redundancy reduc-
tion learning. Neural Networks, 8(5):751–755, 1995.
[8] C. Fyfe and P. L. Lai. ICA using kernel canonical cor-
relation analysis. In Proc. Int. Workshop on Indepen-
dent Component Analysis and Blind Signal Separation
(ICA2000), pages 279–284, Helsinki, Finland, 2000.
[9] W. Härdle. Applied Nonparametric Regression. Cambridge University Press, Cambridge, 1990.
[10] S. Harmeling, A. Ziehe, M. Kawanabe, B. Blankertz, and K.-R. Müller. Nonlinear blind source separation using kernel feature spaces. Submitted to ICA 2001.
[11] T.J. Hastie and R.J. Tibshirani. Generalized Additive
Models, volume 43 of Monographs on Statistics and
Applied Probability. Chapman & Hall, London, 1990.
[12] A. Hyvärinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley, 2001.
[13] A. Hyvärinen and E. Oja. A fast fixed-point algorithm for independent component analysis. Neural Computation, 9(7):1483–1492, 1997.
[14] C. Jutten and J. Hérault. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture. Signal Processing, 24:1–10, 1991.
[15] T.-W. Lee, B. U. Koehler, and R. Orglmeister. Blind source separation of nonlinear mixing models. In Neural Networks for Signal Processing VII, pages 406–415. IEEE Press, 1997.
[16] T.-W. Lee, B. U. Koehler, and R. Orglmeister. Blind source separation of nonlinear mixing models. IEEE International Workshop on Neural Networks for Signal Processing, pages 406–415, September 1997.
[17] T.-W. Lee, A. Ziehe, R. Orglmeister, and T. J. Sejnowski. Combining time-delayed decorrelation and ICA: Towards solving the cocktail party problem. In Proc. ICASSP'98, volume 2, pages 1249–1252, Seattle, 1998.
[18] J. K. Lin, D. G. Grier, and J. D. Cowan. Faithful representation of separable distributions. Neural Computation, 9(6):1305–1320, 1997.
[19] G. Marques and L. Almeida. Separation of nonlin-
ear mixtures using pattern repulsion. In Proc. Int.
Workshop on Independent Component Analysis and
Signal Separation (ICA’99), pages 277–282, Aussois,
France, 1999.
[20] P. Pajunen, A. Hyvärinen, and J. Karhunen. Nonlinear blind source separation by self-organizing maps. In Proc. Int. Conf. on Neural Information Processing, pages 1207–1210, Hong Kong, 1996.
[21] P. Pajunen and J. Karhunen. A maximum likelihood
approach to nonlinear blind source separation. In
Proceedings of the 1997 Int. Conf. on Artificial Neu-
ral Networks (ICANN’97), pages 541–546, Lausanne,
Switzerland, 1997.
[22] P. Pajunen and J. Karhunen, editors. Proc. of the
2nd Int. Workshop on Independent Component Anal-
ysis and Blind Signal Separation, Helsinki, Finland,
June 19-22, 2000. Otamedia, 2000.
[23] L. Parra and C. Spence. Convolutive blind source
separation of non-stationary sources. IEEE Trans. on
Speech and Audio Processing, 8(3):320–327, May
2000. US Patent US6167417.
[24] A. Taleb and C. Jutten. Batch algorithm for source separation in post-nonlinear mixtures. In Proc. First Int. Workshop on Independent Component Analysis and Signal Separation (ICA'99), pages 155–160, Aussois, France, 1999.
[25] A. Taleb and C. Jutten. Source separation in post-nonlinear mixtures. IEEE Trans. on Signal Processing, 47(10):2807–2820, 1999.
[26] H. Valpola, X. Giannakopoulos, A. Honkela, and
J. Karhunen. Nonlinear independent component anal-
ysis using ensemble learning: Experiments and dis-
cussion. In Proc. Int. Workshop on Independent
Component Analysis and Blind Signal Separation
(ICA2000), pages 351–356, Helsinki, Finland, 2000.
[27] H. H. Yang, S. Amari, and A. Cichocki. Information-
theoretic approach to blind separation of sources in
non-linear mixture. Signal Processing, 64(3):291–
300, 1998.
[28] A. Yeredor. Blind separation of Gaussian sources via second-order statistics with asymptotically optimal weighting. IEEE Signal Processing Letters, 7(7):197–200, 2000.
[29] A. Ziehe and K.-R. Müller. TDSEP – an efficient algorithm for blind separation using time structure. In Proc. Int. Conf. on Artificial Neural Networks (ICANN'98), pages 675–680, Skövde, Sweden, 1998.
... The slowness is measured using the variance of the first derivative. In [27] and [28], the authors exploit time correlation in the separation of post-nonlinear mixtures. In [27] an alternating conditional expectation algorithm is applied to approximately invert the post-nonlinear function, then a temporal decorrelation algorithm is used to recover the source signals. ...
... In [27] and [28], the authors exploit time correlation in the separation of post-nonlinear mixtures. In [27] an alternating conditional expectation algorithm is applied to approximately invert the post-nonlinear function, then a temporal decorrelation algorithm is used to recover the source signals. In [28], a similar method is proposed which replaces the first stage by a Gaussianizing transformation. ...
... Let's denote , and . If , then so that the cubic (25) has a unique real root defined by (26) The other source may then be obtained using (27) If , then the mixing (24) has a unique real solution . Thus, the mixture is globally bijective everywhere in . ...
Article
In this paper, we present a new method, formulated in a maximum-likelihood framework, for blindly separating nonlinear mixtures of statistically independent signals. Our method exploits, on the one hand, the knowledge of the parametric model of the mixing transformation (with unknown parameter values), and on the other hand, the possible structure of source signals, i.e., their autocorrelation and/or nonstationarity. One of the main advantages of the proposed method is that it can be implemented even if the analytical expression of the inverse model is unknown. The method is first addressed in a general configuration, then detailed for two special cases, i.e., a simple bijective “toy” model and a linear-quadratic model. The study of the toy model is interesting because of its simplicity and its global bijectivity, which allows us to focus our efforts on parameter estimation. The linear-quadratic model is chosen due to its capacity to describe real-world mixing phenomena. Simulation results, using the toy model and using a subclass of the linear-quadratic model (i.e., the bilinear model), show that taking into account the nonlinearity of the mixing transformations and the structure of signals considerably improves separation performance.
... Blind Separation of independent sources (BSS) is a basic problem in signal processing, which has been considered intensively in the last fifteen years, mainly for linear (instantaneous as well as convolutive) mixtures. More recently, a few researchers [1] [2] [3] [4] [5] [6] [7] [8] addressed the problem of source separation in nonlinear mixtures, whose observations aré×µ. Especially Taleb and Jutten [6] have studied a special and realistic case of nonlinear mixtures, called post nonlinear (PNL) mixtures which are separable. ...
... Blind Separation of independent sources (BSS) is a basic problem in signal processing, which has been considered intensively in the last fifteen years, mainly for linear (instantaneous as well as convolutive) mixtures. More recently, a few researchers [1, 2, 3, 4, 5, 6, 7, 8] addressed the problem of source separation in nonlinear mixtures, whose observations aré×µ. Especially Taleb and Jutten [6] have studied a special and realistic case of nonlinear mixtures, called post nonlinear (PNL) mixtures which are separable. ...
Article
Full-text available
This paper proposes a very simple method for increasing the algorithm speed for separating sources from PNL mix-tures or inverting Wiener systems. The method is based on a pertinent initialization of the inverse system, whose compu-tational cost is very low. The nonlinear part is roughly ap-proximated by pushing the observations to be Gaussian; this method provides a surprisingly good approximation even when the basic assumption is not fully satisfied. The linear part is initialized so that outputs are decorrelated. Experi-ments shows the impressive speed improvement.
... Such criterion often possesses the contrast property in the sense that it can be minimized if and only if the output of the separation system are mutually independent [1] [2]. In the context of linear mixtures, contrast functions can be constructed from cumulants [1] or even correlations if lagged correlations are included [3] [4]. This is possible because of the strong constraint of linearity of the mixture, since it is well known that the independence between a set of random variables cannot in general be inferred from the fact that some of their correlations and cumulants are zero. ...
... Such criterion often possesses the contrast property in the sense that it can be minimized if and only if the output of the separation system are mutually independent [1, 2]. In the context of linear mixtures, contrast functions can be constructed from cumulants [1] or even correlations if lagged correlations are included [3, 4] . This is possible because of the strong constraint of linearity of the mixture, since it is well known that the independence between a set of random variables cannot in general be inferred from the fact that some of their correlations and cumulants are zero. ...
Article
Full-text available
This work focuses on a quadratic dependence measure which can be used for blind source separation. After defining it, we show some links with other quadratic dependence measures used by Feuerverger and Rosenblatt. We develop a prac-tical way for computing this measure, which leads us to a new solution for blind source separation in the case of non-linear mixtures. It consists in first estimating the theoreti-cal quadratic measure, then computing its relative gradient, finally minimizing it through a gradient descent method. Some examples illustrate our method in the post nonlinear mixtures.
... Blind Separation of independent sources (BSS) is a basic problem in signal processing , which has been considered intensively in the last fifteen years, mainly for linear (instantaneous as well as convolutive) mixtures. More recently, a few researchers12345678910 addressed the problem of source separation in nonlinear mixtures , whose observations are e = f (s). Especially Taleb and Jutten [8] have studied a special and realistic case of nonlinear mixtures, called post nonlinear (PNL) mixtures which are separable. ...
... The Matlab code is very simple and very fast. A second algorithm, based on the result on Subsection 2.3, consists in adjusting a nonlinear mapping g so that the Shannon's entropy of z is maximum under the constraint Ez 2 = 1 (see [10] for a similar work). We can parametrize the nonlinear function g, for example by means of neural networks (multylayer perceptron), as showed inFig. ...
Conference Paper
Full-text available
This paper proposes a very fast method for blindly initial- izing a nonlinear mapping which transforms a sum of random variables. The method provides a surprisingly good approximation even when the basic assumption is not fully satisfled. The method can been used success- fully for initializing nonlinearity in post-nonlinear mixtures or in Wiener system inversion, for improving algorithm speed and convergence.
... Blind Separation of independent sources (BSS) is a basic problem in signal processing, which has been considered intensively in the last fifteen years, mainly for linear (instantaneous as well as convolutive) mixtures. More recently, a few researchers [1,2,3,4,5,6,7,8] addressed the problem of source separation in nonlinear mixtures, whose observations are ´×µ. Especially Taleb and Jutten [6] have studied a special and realistic case of nonlinear mixtures, called post nonlinear (PNL) mixtures which are separable. ...
Article
Full-text available
This paper proposes a very simple method for increasing the algorithm speed for separating sources from PNL mixtures or inverting Wiener systems. The method is based on a pertinent initialization of the inverse system, whose computational cost is very low. The nonlinear part is roughly approximated by pushing the observations to be Gaussian; this method provides a surprisingly good approximation even when the basic assumption is not fully satisfied. The linear part is initialized so that outputs are decorrelated. Experiments shows the impressive speed improvement.
... Then by minimizing the outputs' mutual information with respect to the PNL mixture parameters, the estimates of the sources are constructed. In [3] and in [4] Ziehe et al. resolve the non-linearity by either maximizing the outputs' correlation or by bringing the outputs to be as Gaussian as possible. A similar idea of trying to " Gaussianize " the sources using non-linear functions, is also used in [5] as an initialization procedure for PNL BSS algorithms. ...
Conference Paper
We consider the problem of blind estimation of the parameters of noisy non-linear mixtures of sources with unknown discrete alphabets. The nonlinear mixtures are modeled using the "post non-linear" model, in which the source signal undergo a linear mixture first, and then each mixed signal undergoes an unknown nonlinear transformation. The individual nonlinear transformations are modeled in this paper as second-order polynomials, whose parameters are unknown. Using the estimate-maximize algorithm, we derive estimators for all the unknown parameters. We also computed the Cramer-Rao lower bound for the estimation, to which the obtained mean squared estimation error is empirically compared
... A lot of work remains to be done in studying the nonlinear ICA and BSS problems. First, regularization methods based on constraints can be studied further , but other approaches, especially incorporation of temporal statistics [49] (only sketched in this paper) and variational Bayesian ensemble learning [45] . Secondly , remember a better modeling of the relationship between the independent components or sources and the observations is essential for choosing a suitable separation structure and subsequently for studying separability. ...
Article
In this paper, we consider the nonlinear Blind Source Separation BSS and independent component analysis (ICA) problems, and especially uniqueness issues, presenting some new results. A fundamental difficulty in the nonlinear BSS problem and even more so in the nonlinear ICA problem is that they are nonunique without a suitable regularization. In this paper, we mainly discuss three different ways for regularizing the solutions, that have been recently explored.
Article
This work focuses on solving the problem of Blind Source Separation (BSS) using Independent Component Analysis (ICA) method for nonlinear mixtures. Since ICA methods require a dependence measure, we will investigate the use of mutual information and quadratic dependence. Mutual information has already often been used for solving BSS problem, but difficulties occur in order to carry out an asymptotic study. In contrast, the quadratic dependence was introduced recently and has already been used for independence tests. Finally, the difficulty of solving the BSS problem is illustrated through examples of the shape of the objective-functions.
Article
Following the development of nonlinear blind source separation research, the post-nonlinear mixture is taken as a starting point for surveying its algorithms. The post-nonlinear mixture model and its separability are presented, and the existence and nonuniqueness of post-nonlinear blind source separation solutions are discussed. The methods are then summarized, the independence criterion based on the minimization of mutual information is introduced, and representative algorithms proposed in recent years are analyzed and discussed. Finally, the open problems and future directions in post-nonlinear blind source separation research are outlined.
Article
In this paper we propose a separating system for convolutive nonlinear mixtures, since linear instantaneous mixtures are rare in practical problems. The observed mixtures are first transformed into the frequency domain and then separated on each frequency bin. After the permutation problem is solved, the estimates are obtained. We show simulation results with acoustic signals for both instantaneous and convolutive mixtures.
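The key fact behind this frequency-domain approach is that a convolutive mixture becomes an ordinary instantaneous (complex) mixture in every DFT bin, so each bin can be separated independently before solving the permutation problem. A small NumPy check of this identity, using circular convolution for exactness (all names are ours):

```python
import numpy as np

# A convolutive mixture x_i(t) = sum_j (a_ij * s_j)(t) becomes, per DFT
# bin f, an instantaneous complex mixture X(f) = A(f) S(f).
rng = np.random.default_rng(2)
n = 256
s = rng.normal(size=(2, n))                  # two source signals
h = rng.normal(size=(2, 2, 8))               # 8-tap mixing filters a_ij

# time-domain convolutive mixture (circular convolution for exactness)
x = np.zeros((2, n))
for i in range(2):
    for j in range(2):
        x[i] += np.real(np.fft.ifft(np.fft.fft(h[i, j], n) * np.fft.fft(s[j])))

# per-bin instantaneous model: X(f) = A(f) S(f)
S = np.fft.fft(s, axis=1)
X = np.fft.fft(x, axis=1)
A = np.fft.fft(h, n=n, axis=2)               # A[:, :, f] is the 2x2 mixing matrix at bin f
X_model = np.einsum('ijf,jf->if', A, S)      # matches X exactly
```

Any instantaneous ICA method can therefore be run per bin, at the cost of aligning the arbitrary per-bin permutations afterwards.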
Conference Paper
We present methods to blindly separate mixed signals recorded in a room. The learning algorithm is based on information maximization in a single-layer neural network. We focus on the implementation of the learning algorithm and on issues that arise when separating speakers in room recordings. We used an infomax approach in a feedforward neural network implemented in the frequency domain using the polynomial filter matrix algebra technique. Fast convergence was achieved by using a time-delayed decorrelation method as a preprocessing step. Under minimum-phase mixing conditions this preprocessing step was sufficient for the separation of signals. These methods successfully separated a recorded voice from music playing in the background (the cocktail-party problem). Finally, we discuss problems that arise in real-world recordings and their potential solutions.
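Time-delayed decorrelation itself can be sketched in a few lines: whiten the data, then diagonalize a single symmetrized time-lagged covariance matrix (the single-lag AMUSE variant; practical methods jointly diagonalize several lags). Function name and toy signals are our own:

```python
import numpy as np

def tdd_separate(X, tau=1):
    """Time-delayed decorrelation (single-lag AMUSE-style sketch):
    whitening followed by an eigendecomposition of the symmetrized
    lagged covariance separates sources with distinct autocorrelations."""
    X = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(X @ X.T / X.shape[1])
    W = E @ np.diag(d ** -0.5) @ E.T                      # whitening matrix
    Z = W @ X
    Ct = Z[:, :-tau] @ Z[:, tau:].T / (Z.shape[1] - tau)
    Ct = (Ct + Ct.T) / 2                                  # symmetrize lagged covariance
    _, V = np.linalg.eigh(Ct)
    return V.T @ Z                                        # sources up to scale and order

# toy example: two sources with clearly different temporal structure
rng = np.random.default_rng(3)
n = 5000
s1 = np.sin(0.05 * np.arange(n)) + 0.1 * rng.normal(size=n)
s2 = np.convolve(rng.normal(size=n + 4), np.ones(5) / 5, mode='valid')
S = np.vstack([s1, s2])
X = np.array([[1.0, 0.8],
              [0.6, 1.0]]) @ S
Y = tdd_separate(X, tau=2)
```

Because it relies only on second-order temporal statistics, this step is cheap and makes a good preprocessing stage, as the abstract notes for minimum-phase mixing.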
Article
A geometric approach to data representation incorporating information theoretic ideas is presented. The task of finding a faithful representation, where the input distribution is evenly partitioned into regions of equal mass, is addressed. For input consisting of mixtures of statistically independent sources, we treat independent component analysis (ICA) as a computational geometry problem. First, we consider the separation of sources with sharply peaked distribution functions, where the ICA problem becomes that of finding high-density directions in the input distribution. Second, we consider the more general problem for arbitrary input distributions, where ICA is transformed into the task of finding an aligned equipartition. By modifying the Kohonen self-organized feature maps, we arrive at neural networks with local interactions that optimize coding while simultaneously performing source separation. The local nature of our approach results in networks with nonlinear ICA capabilities.
Book
A comprehensive introduction to ICA for students and practitioners. Independent Component Analysis (ICA) is one of the most exciting new topics in fields such as neural networks, advanced statistics, and signal processing. This is the first book to provide a comprehensive introduction to this new technique, complete with the fundamental mathematical background needed to understand and utilize it. It offers a general overview of the basics of ICA, important solutions and algorithms, and in-depth coverage of new applications in image processing, telecommunications, audio signal processing, and more. Independent Component Analysis is divided into four sections that cover:
* General mathematical concepts utilized in the book
* The basic ICA model and its solution
* Various extensions of the basic ICA model
* Real-world applications for ICA models
Authors Hyvarinen, Karhunen, and Oja are well known for their contributions to the development of ICA and here cover all the relevant theory, new algorithms, and applications in various fields. Researchers, students, and practitioners from a variety of disciplines will find this accessible volume both helpful and informative.
Article
Feature extraction from any combination of sensory stimuli can be seen as a detection of statistically correlated combinations of inputs. A mathematical framework that describes this fact is formulated using concepts from information theory. The key idea is to define a bijective transformation that conserves the volume in order to assure the transmission of all the information from inputs to outputs without spurious generation of entropy. In addition, this transformation simultaneously constrains the distribution of the outputs so that the representation is factorial, i.e., the redundancy at the output layer is minimal. We formulate this novel unsupervised learning paradigm for a linear network. In the linear case the method converges to the principal component transformation. Contrary to the "infomax" principle, we minimize the mutual information between the output neurons provided that the transformation conserves the entropy in the vertical sense (from inputs to outputs).
Article
The linear mixture model is assumed in most papers devoted to blind separation. A more realistic mixture model is nonlinear. In this paper, a two-layer perceptron is used as a demixing system to separate sources from a nonlinear mixture. The learning algorithms for the demixing system are derived by two approaches: maximum entropy and minimum mutual information. The algorithms derived from the two approaches share a common structure, and the learning equations for the hidden layer differ from those for the output layer. The natural gradient descent method is applied to maximize the entropy and to minimize the mutual information. An information (entropy or mutual information) back-propagation method is proposed to derive the learning equations for the hidden layer.
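For the output layer, the natural-gradient update derived from either criterion takes the well-known form W <- W + lr (I - g(Y) Y^T) W. The sketch below implements only this single-layer special case with g = tanh (suitable for super-Gaussian sources); the paper's hidden-layer equations, obtained by information back-propagation, are not reproduced here, and all names and parameters are our own:

```python
import numpy as np

def natgrad_infomax(X, n_iter=2000, lr=0.01):
    """Single-layer natural-gradient ICA: repeat the batch update
    W <- W + lr * (I - tanh(Y) Y^T / T) W, whose separating fixed
    points make the outputs Y = W X statistically independent."""
    n, T = X.shape
    W = np.eye(n)
    for _ in range(n_iter):
        Y = W @ X
        W += lr * (np.eye(n) - np.tanh(Y) @ Y.T / T) @ W   # natural gradient step
    return W

# toy example: two mixed super-Gaussian (Laplacian) sources
rng = np.random.default_rng(4)
S = rng.laplace(size=(2, 4000))
X = np.array([[1.0, 0.6],
              [0.4, 1.0]]) @ S
Y = natgrad_infomax(X) @ X
```

The natural gradient replaces the plain gradient's costly matrix inversion with a right-multiplication by W, which is what makes updates of this form practical in both layers of the perceptron.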