Publications (6)
0 Total impact
ABSTRACT: A new blind speech separation (BSS) method for convolutive mixtures is presented. This method uses a sample-by-sample algorithm
to perform subband decomposition by mimicking the processing performed by the human ear. The unknown source signals are
separated by maximizing the entropy of a transformed set of signal mixtures through a gradient ascent algorithm.
Experimental results show the effectiveness of the proposed approach in terms of the signal-to-interference ratio (SIR) and perceptual
evaluation of speech quality (PESQ) criteria. Compared to the fullband method based on the Infomax algorithm and to convolutive
fast independent component analysis (C-FICA), our method achieves a better PESQ score and a substantial improvement in
SIR for different sensor placements.
Keywords: Blind speech separation, subband decomposition, ear model, Infomax, Fast-ICA
06/2011: pages 161-172;
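As an illustrative sketch only: the entropy-maximization step described above can be written as the natural-gradient Infomax update for an instantaneous mixture. The function name, learning rate, and toy mixing matrix below are assumptions for the demo; the paper's subband, convolutive machinery is not reproduced.

```python
import numpy as np

def infomax_separate(X, lr=0.02, iters=500, seed=0):
    """Natural-gradient Infomax for an instantaneous mixture X (n x N).

    Maximizes output entropy through a tanh nonlinearity with the update
    W <- W + lr * (I - tanh(Y) Y^T / N) W.  The paper applies this idea
    per subband to convolutive mixtures; that machinery is omitted here.
    """
    rng = np.random.default_rng(seed)
    n, N = X.shape
    W = np.eye(n) + 0.01 * rng.standard_normal((n, n))
    for _ in range(iters):
        Y = W @ X
        W += lr * (np.eye(n) - np.tanh(Y) @ Y.T / N) @ W
    return W @ X, W

# Toy demo: two super-Gaussian (Laplacian) sources, linear mixture.
rng = np.random.default_rng(1)
S = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.6], [0.5, 1.0]])
Y, W = infomax_separate(A @ S)
```

Each recovered row of `Y` should correlate strongly with one true source (up to permutation and scale), the standard ambiguity of BSS.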
ABSTRACT: In this paper, we develop soft computing models for on-line automatic speech recognition (ASR) based on Bayesian on-line
inference techniques. Bayesian on-line inference for change point detection (BOCPD) is tested for on-line environmental learning
using highly non-stationary noisy speech samples from the Aurora2 speech database. Significant improvement in predicting and
adapting to new acoustic conditions is obtained for highly non-stationary noises. The simulation results show that the Bayesian
on-line inference-based soft computing approach is a promising solution to on-line ASR for real-time applications.
03/2011: pages 445-452;
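The BOCPD recursion itself (an Adams–MacKay-style run-length posterior) can be sketched for a simple Gaussian mean-shift model with known observation variance. All parameter values below are illustrative assumptions, not the paper's ASR configuration.

```python
import numpy as np

def bocpd(x, hazard=0.01, mu0=0.0, var0=1.0, varx=1.0):
    """Bayesian on-line change-point detection: run-length posterior.

    Gaussian observations with known variance varx, conjugate Gaussian
    prior N(mu0, var0) on the mean, constant hazard rate.  Returns R
    where R[t] is the run-length distribution after observing t samples.
    """
    T = len(x)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0
    mu, var = np.array([mu0]), np.array([var0])
    for t, xt in enumerate(x):
        # Predictive probability of xt under each current run length.
        pv = var + varx
        pred = np.exp(-0.5 * (xt - mu) ** 2 / pv) / np.sqrt(2 * np.pi * pv)
        R[t + 1, 1:t + 2] = R[t, :t + 1] * pred * (1 - hazard)   # growth
        R[t + 1, 0] = (R[t, :t + 1] * pred).sum() * hazard       # change
        R[t + 1] /= R[t + 1].sum()
        # Conjugate posterior update for each surviving run length.
        var_new = 1.0 / (1.0 / var + 1.0 / varx)
        mu_new = var_new * (mu / var + xt / varx)
        mu = np.concatenate(([mu0], mu_new))
        var = np.concatenate(([var0], var_new))
    return R

# Mean shift at sample 100: the most probable run length resets there.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(5, 1, 100)])
R = bocpd(x)
```

In an ASR front end, `x` would be a per-frame noise statistic rather than raw samples, and a run-length reset signals a new acoustic condition.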
ABSTRACT: In this paper we propose an algorithm for estimating noise in highly non-stationary noisy environments, which is a challenging
problem in speech enhancement. The method builds on minima-controlled recursive averaging (MCRA) to obtain an accurate, robust,
and efficient estimate of the noise power spectrum. We propose a two-stage technique to prevent the appearance
of musical noise after enhancement. In the first stage, the algorithm filters the noisy speech to obtain a robust signal with minimal
distortion. Subsequently, it estimates the residual noise using MCRA and removes it with spectral subtraction. The
proposed Filtered MCRA (FMCRA) performance is evaluated using objective tests on the Aurora database under various noisy environments.
These measures indicate higher output SNR and lower residual noise and distortion.
Keywords: Speech enhancement, noise estimation, musical noise, spectral subtraction
05/2010: pages 72-81;
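The spectral-subtraction stage can be sketched as follows. For simplicity the noise spectrum here is averaged over an assumed speech-free lead-in, whereas the paper's FMCRA tracks spectral minima recursively (MCRA) after a pre-filtering stage; all parameters are illustrative.

```python
import numpy as np

def spectral_subtract(noisy, frame=256, hop=128, noise_frames=20, floor=0.05):
    """Basic magnitude spectral subtraction (illustrative sketch).

    Noise is estimated from the first `noise_frames` frames, assumed
    speech-free.  A spectral floor keeps small residual bins from being
    zeroed, which is one simple guard against musical noise.
    """
    win = np.hanning(frame)
    starts = range(0, len(noisy) - frame + 1, hop)
    F = np.array([np.fft.rfft(noisy[s:s + frame] * win) for s in starts])
    mag, phase = np.abs(F), np.angle(F)
    noise = mag[:noise_frames].mean(axis=0)          # crude noise estimate
    clean = np.maximum(mag - noise, floor * mag)     # subtract with floor
    out = np.zeros(len(noisy))
    for k, s in enumerate(starts):                   # overlap-add synthesis
        out[s:s + frame] += np.fft.irfft(clean[k] * np.exp(1j * phase[k]), frame)
    return out

# Demo: one second of noise only, then a noisy tone.
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(2 * fs) / fs
clean_sig = np.where(t >= 1.0, np.sin(2 * np.pi * 440 * t), 0.0)
noisy = clean_sig + 0.3 * rng.standard_normal(2 * fs)
out = spectral_subtract(noisy)
```

With the noise magnitude removed per bin, the output SNR on the tone segment should exceed that of the unprocessed mixture.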
ABSTRACT: In this paper we report the results of a comparative study of blind speech signal separation approaches. Three algorithms,
Oriented Principal Component Analysis (OPCA), High Order Statistics (HOS), and Fast Independent Component Analysis (Fast-ICA),
are objectively compared in terms of the signal-to-interference ratio (SIR) criterion. The results of experiments carried out on the
TIMIT and AURORA speech databases show that OPCA outperforms the other techniques. It turns out that OPCA can be used to
blindly separate temporal signals from their linear mixtures without the need for a pre-whitening step.
Keywords: Blind source separation, speech signals, second-order statistics, Oriented Principal Component Analysis
05/2010: pages 117-124;
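For reference, the SIR criterion used in such comparisons can be computed with a common projection-based definition (the paper's exact formula may differ): the component of the estimate that projects onto the true source counts as signal, the orthogonal remainder as interference.

```python
import numpy as np

def sir_db(est, src):
    """Signal-to-interference ratio in dB (projection-based definition)."""
    s = src / np.linalg.norm(src)
    target = (est @ s) * s              # component aligned with the source
    interference = est - target         # everything else
    return 10 * np.log10(np.sum(target ** 2) / np.sum(interference ** 2))

# Sanity check: source plus orthogonal interference at 10% amplitude.
rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
n = rng.standard_normal(4096)
n -= (n @ s) / (s @ s) * s                            # make exactly orthogonal
n *= 0.1 * np.linalg.norm(s) / np.linalg.norm(n)      # 10% of source norm
est = s + n
```

By construction this example gives a power ratio of 100, i.e. 20 dB.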
ABSTRACT: This paper presents a hybrid technique combining the Karhunen-Loève Transform (KLT), the Multilayer Perceptron (MLP) and Genetic
Algorithms (GAs) to obtain less-variant Mel-frequency parameters. The advantages of such an approach are that robustness
can be achieved without modifying the recognition system, and that neither assumptions about nor estimation of the noise is required.
To evaluate the effectiveness of the proposed approach, an extensive set of continuous speech recognition experiments is
carried out using the NTIMIT telephone speech database. The results show that the proposed approach outperforms the baseline
and conventional systems.
12/2007: pages 169-178;
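Of the three components, only the KLT stage is standard enough to sketch; how the paper couples it with the MLP and GA stages is not reproduced, and the function name and dimensions below are illustrative.

```python
import numpy as np

def klt(features, k):
    """Karhunen-Loeve transform: project zero-mean feature vectors onto
    the top-k eigenvectors of their sample covariance matrix."""
    X = features - features.mean(axis=0)
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    top = np.argsort(w)[::-1][:k]        # eigenvectors by decreasing variance
    return X @ V[:, top]

# 13-dimensional MFCC-like frames reduced to 5 decorrelated components.
rng = np.random.default_rng(0)
frames = rng.standard_normal((1000, 13)) @ rng.standard_normal((13, 13))
Z = klt(frames, 5)
```

Because the projection uses eigenvectors of the feature covariance, the retained components are mutually decorrelated, which is the property exploited when feeding them to a downstream classifier.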
ABSTRACT: A novel method for spectral-domain fundamental frequency (F0) estimation is proposed. The method estimates F0 from the power spectrum of a windowed speech segment. For this purpose, a new transform is introduced; its prominent feature is that it estimates F0 from the speech-segment power spectrum by exploiting the power spectrum of the window function, hence the name Window-Based transform. A comparison with the autocorrelation and cepstral pitch estimation methods demonstrates the superiority of the proposed method in noisy environments.
Université du Québec à Montréal
Institut national de la recherche scientifique