Conference Proceeding

Audio source separation based on independent component analysis.

NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
In: Proceedings of the 2004 International Symposium on Circuits and Systems (ISCAS '04), Volume 5, pp. 668-671; 01/2004; DOI: 10.1109/ISCAS.2004.1329896
Source: DBLP

ABSTRACT: This paper introduces the blind source separation (BSS) of convolutive mixtures of acoustic signals, especially speech. A statistical and computational technique called independent component analysis (ICA) is examined. By achieving nonlinear decorrelation, nonstationary decorrelation, or time-delayed decorrelation, we can recover the source signals from the observed mixed signals alone. Particular attention is paid to the physical interpretation of BSS from the acoustical signal processing point of view. Frequency-domain BSS is shown to be equivalent to two sets of frequency-domain adaptive microphone arrays, i.e., adaptive beamformers (ABFs). Although BSS can reduce reverberant sounds to some extent in the same way as ABF, it mainly removes the sound from the jammer direction. This is why BSS has difficulties with long reverberation in the real world. If the sources are not "independent," the dependence introduces bias noise into the estimate of the correct unmixing filter coefficients. Therefore, the performance of BSS is limited by that of ABF. Although BSS is upper bounded by ABF, it has a strong advantage over ABF: BSS can be regarded as an intelligent version of ABF in the sense that it can adapt without any information on the array manifold or the target direction, and sources can be simultaneously active in BSS.
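As a rough illustration of the time-delayed decorrelation mentioned in the abstract, the following sketch separates a two-channel mixture by whitening the observations and then diagonalizing a time-lagged covariance matrix (the idea behind the AMUSE algorithm). It is not the paper's method: the paper addresses convolutive mixtures, whereas this sketch assumes a real-valued instantaneous mixture, the kind of problem that frequency-domain BSS approximately faces within a single frequency bin (there with complex values). The function name, the sinusoidal test sources, and the mixing matrix are all assumptions made for the example.

```python
import numpy as np


def separate_by_delayed_decorrelation(x, lag=1):
    """Toy BSS for an instantaneous mixture x of shape (channels, samples).

    Returns (y, w), where y = w @ x are the estimated sources, recovered
    only up to permutation, sign, and scale.
    """
    x = x - x.mean(axis=1, keepdims=True)
    n = x.shape[1]

    # Step 1: whiten, so the zero-lag covariance becomes the identity.
    c0 = x @ x.T / n
    d, e = np.linalg.eigh(c0)
    v = e @ np.diag(d ** -0.5) @ e.T
    z = v @ x

    # Step 2: diagonalize the symmetrized time-lagged covariance.
    # This works only if the sources have distinct autocorrelations at `lag`.
    c1 = z[:, :-lag] @ z[:, lag:].T / (n - lag)
    c1 = 0.5 * (c1 + c1.T)
    _, u = np.linalg.eigh(c1)

    w = u.T @ v
    return w @ x, w


# Hypothetical test signals: two sinusoids with different frequencies,
# mixed by an arbitrary (assumed) 2x2 matrix.
t = np.arange(4000)
sources = np.vstack([np.sin(2 * np.pi * 0.01 * t),
                     np.sin(2 * np.pi * 0.11 * t)])
mixing = np.array([[1.0, 0.6],
                   [0.5, 1.0]])
estimated, unmixing = separate_by_delayed_decorrelation(mixing @ sources)

# Absolute correlation between each true source and each estimate.
print(np.abs(np.corrcoef(np.vstack([sources, estimated]))[:2, 2:]).round(3))
```

Note that nothing in the function uses the mixing matrix or the source directions; only the observed mixtures are needed, which is the "blind" property the abstract emphasizes.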

  • Conference Proceeding: Blind source separation of real world signals
    ABSTRACT: We present a method to separate and deconvolve sources which have been recorded in real environments. The use of noncausal FIR filters allows us to deal with nonminimum-phase mixing systems. The learning rules can be derived from different viewpoints such as information maximization, maximum likelihood, and negentropy, which result in similar rules for the weight update. We transform the learning rule into the frequency domain, where the convolution and deconvolution operations become multiplication and division. In particular, the FIR polynomial algebra techniques as used by Lambert present an efficient tool to solve true phase inverse systems, allowing a simple implementation of noncausal filters. The significance of the methods is shown by the successful separation of two voices and by separating a voice that has been recorded with loud music in the background. The recognition rate of an automatic speech recognition system is increased after separating the speech signals.
    International Conference on Neural Networks, 1997; 07/1997
  • Article: The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech
    ABSTRACT: Despite several recent proposals to achieve blind source separation (BSS) for realistic acoustic signals, the separation performance is still not good enough. In particular, when the impulse responses are long, performance is highly limited. In this paper, we consider a two-input, two-output convolutive BSS problem. First, we show that it is not advantageous to be constrained by the condition T>P, where T is the frame length of the DFT and P is the length of the room impulse responses. We show that there is an optimum frame size, determined by the trade-off between keeping enough samples in each frequency bin to estimate statistics and covering the whole reverberation. We also clarify the reason for the poor performance of BSS in long reverberant environments, highlighting that the framework of BSS works as two sets of frequency-domain adaptive beamformers. Although BSS can reduce reverberant sounds to some extent, like adaptive beamformers, it mainly removes the sounds from the jammer direction. This is the reason for the difficulty of BSS in reverberant environments.
    IEEE Transactions on Speech and Audio Processing 04/2003; · 2.29 Impact Factor
  • Article: Equivalence between Frequency-Domain Blind Source Separation and Frequency-Domain Adaptive Beamforming for Convolutive Mixtures
    ABSTRACT: Frequency-domain blind source separation (BSS) is shown to be equivalent to two sets of frequency-domain adaptive beamformers (ABFs) under certain conditions. The zero search of the off-diagonal components in the BSS update equation can be viewed as the minimization of the mean square error in the ABFs. The unmixing matrix of the BSS and the filter coefficients of the ABFs converge to the same solution if the two source signals are ideally independent. If they are dependent, this results in a bias for the correct unmixing filter coefficients. Therefore, the performance of the BSS is limited to that of the ABF if the ABF can use exact geometric information. This understanding gives an interpretation of BSS from a physical point of view.
    EURASIP Journal on Advances in Signal Processing. 01/2003;
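
The frame-size trade-off described in the abstract of "The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech" can be made concrete with a little arithmetic: for a fixed recording length, a longer DFT frame T covers more of the reverberation but leaves fewer frames, and therefore fewer samples per frequency bin, for estimating statistics. The sketch below only counts frames. The recording length (3 s at 8 kHz), the half-frame hop, and the function name are assumptions chosen for illustration, not values taken from the listed papers.

```python
def frames_per_bin(num_samples: int, frame_len: int, hop: int) -> int:
    """Number of complete analysis frames, i.e. samples available per frequency bin."""
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // hop


# Assumed setup: 3 seconds of audio sampled at 8 kHz, frames overlapping by half.
num_samples = 3 * 8000
for frame_len in (512, 1024, 2048, 4096):
    count = frames_per_bin(num_samples, frame_len, hop=frame_len // 2)
    print(f"T = {frame_len:4d} -> {count} frames per frequency bin")
```

Under these assumptions, going from T = 512 to T = 4096 reduces the count from 92 frames to 10, which shows why simply choosing T larger than the impulse response length P is not free.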
