Publications (6)
Chapter: Convolutive Blind Separation of Speech Mixtures Using Auditory-Based Subband Model
ABSTRACT: A new blind speech separation (BSS) method for convolutive mixtures is presented. This method uses a sample-by-sample algorithm to perform the subband decomposition by mimicking the processing performed by the human ear. The unknown source signals are separated by maximizing the entropy of a transformed set of signal mixtures through the use of a gradient ascent algorithm. Experimental results show the efficiency of the proposed approach in terms of the signal-to-interference ratio (SIR) and perceptual evaluation of speech quality (PESQ) criteria. Compared to the fullband method that uses the Infomax algorithm and to the convolutive fast independent component analysis (C-FICA), our method achieves a better PESQ score and a substantial improvement in SIR for different locations of sensor inputs.
Keywords: Blind speech separation; Subband decomposition; Ear model; Infomax; Fast-ICA
06/2011: pages 161-172
Chapter: Real-Time Bayesian Inference: A Soft Computing Approach to Environmental Learning for On-Line Robust Automatic Speech Recognition
ABSTRACT: In this paper, we develop soft computing models for on-line automatic speech recognition (ASR) based on Bayesian on-line inference techniques. Bayesian on-line inference for change point detection (BOCPD) is tested for on-line environmental learning using highly non-stationary noisy speech samples from the Aurora2 speech database. Significant improvement in predicting and adapting to new acoustic conditions is obtained for highly non-stationary noises. The simulation results show that the Bayesian on-line inference-based soft computing approach could be one possible solution to on-line ASR for real-time applications.
03/2011: pages 445-452
Chapter: Robust Speech Enhancement Using Two-Stage Filtered Minima Controlled Recursive Averaging
ABSTRACT: In this paper we propose an algorithm for estimating noise in highly non-stationary noisy environments, which is a challenging problem in speech enhancement. This method is based on minima-controlled recursive averaging (MCRA), whereby an accurate, robust and efficient noise power spectrum estimation is demonstrated. We propose a two-stage technique to prevent the appearance of musical noise after enhancement. In the first stage, the algorithm filters the noisy speech to achieve a robust signal with minimum distortion. Subsequently, it estimates the residual noise using MCRA and removes it with spectral subtraction. The performance of the proposed Filtered MCRA (FMCRA) is evaluated using objective tests on the Aurora database under various noisy environments. These measures indicate higher output SNR and lower output residual noise and distortion.
Keywords: speech enhancement; noise estimation; musical noise; spectral subtraction
05/2010: pages 72-81
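The second stage described in this abstract (noise tracking followed by spectral subtraction) can be illustrated with a rough sketch. This is not the authors' FMCRA: the first-stage filter is not modelled, the speech-presence decision is a hard threshold rather than MCRA's soft probability, and the function names and parameter values (`alpha_s`, `alpha_d`, `delta`, `floor`) are assumptions chosen for illustration.

```python
def track_noise(power_frames, alpha_s=0.8, alpha_d=0.95, delta=5.0):
    """Simplified minima-controlled noise tracking (hypothetical helper).

    power_frames: list of frames, each a list of per-bin power values.
    Returns the per-bin noise power estimate after the last frame.
    """
    n_bins = len(power_frames[0])
    smoothed = list(power_frames[0])  # time-smoothed power per bin
    minimum = list(power_frames[0])   # running minimum of the smoothed power
    noise = list(power_frames[0])     # noise estimate, initialised from frame 0
    for frame in power_frames[1:]:
        for k in range(n_bins):
            smoothed[k] = alpha_s * smoothed[k] + (1 - alpha_s) * frame[k]
            # A full tracker would reset the minimum over a sliding window;
            # that step is omitted here for brevity.
            minimum[k] = min(minimum[k], smoothed[k])
            # Hard decision: a large ratio to the local minimum suggests speech.
            speech_present = smoothed[k] > delta * minimum[k]
            if not speech_present:
                # Recursive averaging, applied only when speech seems absent.
                noise[k] = alpha_d * noise[k] + (1 - alpha_d) * frame[k]
    return noise


def spectral_subtract(power_frame, noise, floor=0.01):
    """Power spectral subtraction with a spectral floor to limit musical noise."""
    return [max(p - n, floor * p) for p, n in zip(power_frame, noise)]


# Ten noise-only frames followed by three frames with speech energy in bin 0.
frames = [[1.0, 1.0]] * 10 + [[101.0, 1.0]] * 3
noise_estimate = track_noise(frames)
enhanced = spectral_subtract([101.0, 1.0], noise_estimate)
```

The noise estimate is frozen while speech is detected, so the speech energy in bin 0 does not leak into it; the spectral floor keeps the noise-only bin from being driven to zero.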
Chapter: A Comparative Study of Blind Speech Separation Using Subspace Methods and Higher Order Statistics
ABSTRACT: In this paper we report the results of a comparative study on blind speech signal separation approaches. Three algorithms, Oriented Principal Component Analysis (OPCA), High Order Statistics (HOS), and Fast Independent Component Analysis (Fast-ICA), are objectively compared in terms of signal-to-interference ratio criteria. The results of experiments carried out using the TIMIT and AURORA speech databases show that OPCA outperforms the other techniques. It turns out that OPCA can be used for blindly separating temporal signals from their linear mixtures without the need for a pre-whitening step.
Keywords: Blind source separation; speech signals; second-order statistics; Oriented Principal Component Analysis
05/2010: pages 117-124
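The abstract does not state the exact signal-to-interference ratio definition used in the study. A common energy-ratio form, sketched below, assumes the separated output has already been decomposed into a target component and an interference component; the function name `sir_db` is hypothetical.

```python
import math


def sir_db(target, interference):
    """Signal-to-interference ratio in dB from target and interference samples."""
    p_target = sum(x * x for x in target)
    p_interf = sum(x * x for x in interference)
    if p_interf == 0:
        return math.inf    # no residual interference at all
    if p_target == 0:
        return -math.inf   # no target energy left in the output
    return 10.0 * math.log10(p_target / p_interf)


# Target amplitude 1.0 against interference amplitude 0.1 (a 100:1 energy ratio).
example = sir_db([1.0, 1.0, 1.0, 1.0], [0.1, 0.1, 0.1, 0.1])
```

A higher value means less leakage from the competing source into the separated output.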
Chapter: A Hybrid Genetic-Neural Front-End Extension for Robust Speech Recognition over Telephone Lines
ABSTRACT: This paper presents a hybrid technique combining the Karhunen-Loève Transform (KLT), the Multilayer Perceptron (MLP) and Genetic Algorithms (GAs) to obtain less-variant Mel-frequency parameters. The advantages of such an approach are that robustness can be reached without modifying the recognition system, and that neither assumption nor estimation of the noise is required. To evaluate the effectiveness of the proposed approach, an extensive set of continuous speech recognition experiments is carried out using the NTIMIT telephone speech database. The results show that the proposed approach outperforms the baseline and conventional systems.
12/2007: pages 169-178
Article: A Method Utilizing Window Function Frequency Characteristics for Noise-Robust Spectral Pitch Estimation
ABSTRACT: A novel method for spectral-domain fundamental frequency (F0) estimation is proposed. The basis of this method is estimating F0 using the power spectrum of a windowed speech segment. For this purpose, a new transform is introduced. The prominent feature of this transform is that it estimates F0 from the speech segment power spectrum by exploiting the window function power spectrum. As a result, this transform is named the Window-Based transform. A comparison with the autocorrelation and cepstral pitch estimation methods demonstrates the superiority of the proposed method in noisy environments.
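For context, the autocorrelation baseline that this abstract compares against can be sketched as follows. This is not the proposed Window-Based transform, whose definition is not given in the abstract; the function name `autocorr_f0` and the search range (`f0_min`, `f0_max`) are assumptions made for illustration.

```python
import math


def autocorr_f0(frame, fs, f0_min=60.0, f0_max=400.0):
    """Estimate F0 as fs / lag at the autocorrelation peak in a plausible lag range."""
    lag_min = max(1, int(fs / f0_max))
    lag_max = min(int(fs / f0_min), len(frame) - 1)
    best_lag, best_val = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation of the frame at this lag.
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


# A clean 200 Hz sinusoid sampled at 8 kHz (50 ms frame) stands in for voiced speech.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
f0 = autocorr_f0(frame, fs)
```

Restricting the lag range to plausible pitch periods avoids the trivial peak at lag zero; on noisy speech the peak becomes less distinct, which is the weakness that spectral-domain estimators aim to address.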
Institutions
2011
Université du Québec à Montréal
Montréal, Quebec, Canada
2010
Institut national de la recherche scientifique
Québec, Quebec, Canada