R. Mukai

NTT DATA Corporation, Edo, Tōkyō, Japan

Publications (28) · 52.61 Total impact

  • ABSTRACT: This paper proposes a new formulation and optimization procedure for grouping frequency components in frequency-domain blind source separation (BSS). We adopt two separation techniques, independent component analysis (ICA) and time-frequency (T-F) masking, for the frequency-domain BSS. With ICA, grouping the frequency components corresponds to aligning the permutation ambiguity of the ICA solution in each frequency bin. With T-F masking, grouping the frequency components corresponds to classifying sensor observations in the time-frequency domain for individual sources. The grouping procedure is based on estimating anechoic propagation model parameters by analyzing ICA results or sensor observations. More specifically, the time delays of arrival and attenuations from a source to all sensors are estimated for each source. The focus of this paper includes the applicability of the proposed procedure for a situation with wide sensor spacing where spatial aliasing may occur. Experimental results show that the proposed procedure effectively separates two or three sources with several sensor configurations in a real room, as long as the room reverberation is moderately low.
    IEEE Transactions on Audio Speech and Language Processing 08/2007; · 1.68 Impact Factor
  • ABSTRACT: This paper presents a new method for estimating the direction of arrival (DOA) of source signals whose number N can exceed the number of sensors M. Subspace based methods, e.g., the MUSIC algorithm, have been widely studied; however, they are only applicable when M > N. Another conventional independent component analysis based method allows M ≥ N; however, it cannot be applied when M < N. By contrast, our new method can be applied where the sources outnumber the sensors (i.e., an underdetermined case M < N) by assuming source sparseness. Our method can cope with 2- or 3-dimensionally distributed sources with a 2- or 3-dimensional sensor array. We obtained promising experimental results for 3 × 4, 3 × 5 and 4 × 5 (#sensors × #speech sources) in a room (RT60 = 120 ms).
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006
  • ABSTRACT: This paper describes a method for solving the permutation problem of frequency-domain blind source separation (BSS). The method analyzes the mixing system information estimated with independent component analysis (ICA). When we use widely spaced sensors or increase the sampling rate, spatial aliasing may occur for high frequencies due to the possibility of multiple cycles in the sensor spacing. In such cases, the estimated information would imply multiple possibilities for a source location. This causes some difficulty when analyzing the information. We propose a new method designed to overcome this difficulty. This method first estimates the model parameters for the mixing system at low frequencies where spatial aliasing does not occur, and then refines the estimations by using data at all frequencies. This refinement leads to precise parameter estimation and therefore precise permutation alignment. Experimental results show the effectiveness of the new method.
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006
  • ABSTRACT: This paper describes the frequency-domain blind source separation (BSS) of convolutively mixed acoustic signals using independent component analysis (ICA). The most critical issue related to frequency domain BSS is the permutation problem. This paper presents two methods for solving this problem. Both methods are based on the clustering of information derived from a separation matrix obtained by ICA. The first method is based on direction of arrival (DOA) clustering. This approach is intuitive and easy to understand. The second method is based on normalized basis vector clustering. This method is less intuitive than the DOA based method, but it has several advantages. First, it does not need sensor array geometry information. Secondly, it can fully utilize the information contained in the separation matrix, since the clustering is performed in high-dimensional space. Experimental results show that our methods realize BSS in various situations such as the separation of many speech signals located in a 3-dimensional space, and the extraction of primary sound sources surrounded by many background interferences.
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006
  • ABSTRACT: This paper presents a prototype system for blind source separation (BSS) of many speech signals and describes the techniques used in the system. Our system uses 8 microphones located at the vertexes of a 4 cm × 4 cm × 4 cm cube and has the ability to separate signals distributed in three-dimensional space. The mixed signals observed by the microphone array are processed by independent component analysis (ICA) in the frequency domain and separated into a given number of signals (up to 8). We carried out experiments in an ordinary office and obtained more than 20 dB of SIR improvement.
    Applications of Signal Processing to Audio and Acoustics, 2005. IEEE Workshop on; 11/2005
  • ABSTRACT: This paper presents a method for estimating location information about multiple sources. The proposed method uses independent component analysis (ICA) as a main statistical tool. The near-field model as well as the far-field model can be assumed in this method. As an application of the method, we show experimental results for the direction-of-arrival (DOA) estimation of three sources that were positioned 3-dimensionally.
    Antennas and Propagation Society International Symposium, 2005 IEEE; 08/2005
  • ABSTRACT: This paper presents a method for enhancing a dominant target source that is close to sensors, and suppressing other interferences. The enhancement is performed blindly, i.e. without knowing the number of total sources or information about each source, such as position and active time. We consider a general case where the number of sources is larger than the number of sensors. We employ a two-stage processing technique where a spatial filter is first employed in each frequency bin and time-frequency masking is then used to improve the performance further. To obtain the spatial filter we employ independent component analysis and then select the component of the target source. Time-frequency masks in the second stage are obtained by calculating the angle between the basis vector corresponding to the target source and a sample vector. The experimental results for a simulated cocktail party situation were very encouraging.
    Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on; 04/2005
  • ABSTRACT: Musical noise is a typical problem with blind source separation using a time-frequency mask. We report that a fine-shift and overlap-add method reduces the musical noise without degrading the separation performance. The effectiveness was confirmed by results of a listening test undertaken in a room with a reverberation time of RT60 = 130 ms.
    Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on; 04/2005
  • Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005); 03/2005
  • Journal of Systems, Control and Information. 10/2004; 48(10):401-408.
  • ABSTRACT: This work presents a method for solving the permutation problem of frequency domain blind source separation (BSS) when the number of source signals is large, and the potential source locations are omnidirectional. We propose a combination of small and large spacing sensor pairs with various axis directions in order to obtain proper geometrical information for solving the permutation problem. Experimental results show that the proposed method can separate a mixture of speech signals that come from various directions, even when two of them come from the same direction.
    Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium on; 06/2004
  • ABSTRACT: Blind source separation (BSS) for convolutive mixtures can be efficiently achieved in the frequency domain, where independent component analysis is performed separately in each frequency bin. However, frequency-domain BSS involves a permutation problem, which is well known as a difficult problem, especially when the number of sources is large. This paper presents a method for solving the permutation problem, which works well even for many sources. The successful solution for the permutation problem highlights another problem with frequency-domain BSS that arises from the circularity of the discrete frequency representation. This paper discusses the phenomena of the problem and presents a method for solving it. With these two methods, we can separate many sources with a practical execution time. Moreover, real-time processing is currently possible for up to three sources with our implementation.
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on; 06/2004
  • ABSTRACT: In this paper, we propose a method for separating speech signals when there are more signals than sensors. Several methods have already been proposed for solving the underdetermined problem, and some of these utilize the sparseness of speech signals. These methods employ binary masks to extract the signals, and therefore, their extracted signals contain loud musical noise. To overcome this problem, we propose combining a sparseness approach and independent component analysis (ICA). First, using sparseness, we estimate the time points when only one source is active. Then, we remove this single source from the observations and apply ICA to the remaining mixtures. Experimental results show that our proposed sparseness and ICA (SPICA) method can separate signals with little distortion even in reverberant conditions of T_R = 130 and 200 ms.
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on; 06/2004
  • ABSTRACT: The paper presents a method for solving the permutation problem of frequency domain blind source separation (BSS) when source signals come from the same or similar directions. Geometric information such as the direction of arrival (DOA) is helpful for solving the permutation problem, and a combination of the DOA based and correlation based methods provides a robust and precise solution. However, when signals come from similar directions, the DOA based approach fails, and we have to use only the correlation based method whose performance is unstable. We show that an interpretation of the ICA solution by a near-field model yields information about spheres on which source signals exist, which can be used as an alternative to the DOA. Experimental results show that the proposed method can robustly separate a mixture of signals arriving from the same direction.
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on; 06/2004
  • ABSTRACT: In this paper, single-input multiple-output (SIMO)-model-based blind source separation (BSS) is addressed, where unknown mixed source signals are detected at the microphones, and these signals can be separated, not into monaural source signals but into SIMO-model-based signals from independent sources as they are at the microphones. This technique is highly applicable to high-fidelity signal processing such as binaural signal processing. First, we provide an experimental comparison between two kinds of SIMO-model-based BSS methods, namely, traditional frequency-domain ICA with projection-back processing (FDICA-PB), and SIMO-ICA recently proposed by the authors. Secondly, we propose a new combination technique of the FDICA-PB and SIMO-ICA, which can achieve a higher separation performance in comparison to the two methods. The experimental results reveal that the accuracy of the separated SIMO signals in the simple SIMO-ICA is inferior to that of FDICA-PB, but the proposed combination technique can outperform both simple FDICA-PB and SIMO-ICA.
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on; 06/2004
  • Proc. of 2004 NTT Workshop on Communication Scene Analysis; 04/2004
  • Proc. of the 18th International Congress on Acoustics (ICA 2004); 04/2004
  • 2004 NTT Workshop on Communication Scene Analysis; 01/2004
  • Proc. IWAENC2003; 09/2003
  • ABSTRACT: Using algorithmic complexity to perform blind source separation (BSS) was first proposed by Pajunen. This approach presents the advantage of taking the whole signal structure into account to achieve separation, whereas standard ICA-based methods only use either time-correlations or higher order statistics in order to do so. Another advantage of this approach is that no assumptions about the probability distribution of the source signals need to be made. However, although algorithmic complexity based methods have been shown to outperform standard ICA algorithms in the instantaneous BSS case, they have not been applied to convolutive BSS to date. In this paper, we show that it is also possible to use algorithmic complexity as a separating criterion to perform BSS for convolutive mixtures and suggest a method to do so. Testing the proposed method by computer simulation yielded results which are encouraging in terms of SNR performance.
    Proc. of the 8th International Workshop on Acoustic Echo and Noise Control (IWAENC 2003); 09/2003