Content available from Neural Computing and Applications
ORIGINAL ARTICLE
Speech dereverberation and source separation using DNN-WPE and LWPR-PCA
Jasmine J. C. Sheeja 1 • B. Sankaragomathi 2
Received: 7 April 2021 / Accepted: 22 September 2022 / Published online: 8 January 2023
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023
Abstract
Speech signals captured by distant microphones are often corrupted by acoustic interference such as noise and
reverberation, which degrades the quality of the speech. The acquired signals must therefore be processed to
separate the blind sources and remove the reverberation. We propose a novel speech separation and dereverberation
method that combines Locally Weighted Projection Regression (LWPR)-based Principal Component Analysis (PCA) with
Deep Neural Network (DNN)-based Weighted Prediction Error (WPE). The proposed method preprocesses the mixed
reverberant signal before applying Blind Source Separation (BSS) and Blind Dereverberation (BD). Preprocessing
uses the fast Fourier transform (FFT) and whitening to convert the time-domain signal into the frequency domain
and to generate the transformation matrices. LWPR-PCA then performs the BSS, and DNN-WPE performs the BD. The
proposed method is compared experimentally with the existing RPCA-SNMF, CBF, BA-CNMF, AFMNMF, and ISC-LPKF
approaches. The results indicate that the proposed method effectively separates the original signal from the
mixed reverberant signals.
Keywords Locally Weighted Projection Regression (LWPR) · Blind Source Separation (BSS) · Dereverberation ·
Reverberation · PCA · Speech signals
1 Introduction
Sound signals recorded with microphones are usually
mixed with unwanted components such as noise,
reverberation [1], and interference [2]. Of these,
reverberation is the distortion introduced into the source
signal as it travels from the source to the destination
along multiple paths that differ in length and attenuation.
A reverberant signal with a delay of 50 ms is acceptable
for both human perception and Automatic Speech
Recognition (ASR); with further delay, however, both
are degraded [1]. Noise can also be added to the signal
depending on the position of the microphones, and it
increases with the distance between the speakers and the
microphones. Finally, interference is the addition of
unwanted signals as the signal travels from the source to
the destination.
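The multipath description above can be illustrated with a minimal sketch in which the observed signal is a sum of delayed, attenuated copies of the source. This is only an illustration and not part of the proposed method; the function name add_reverb, the sample rate, and the delay and gain values are assumptions chosen for readability.

```python
import numpy as np

def add_reverb(source, paths, sample_rate):
    """Return the sum of delayed, attenuated copies of `source`.

    `paths` is a list of (delay_seconds, gain) pairs. The values used
    below are hypothetical and only illustrate the multipath idea.
    """
    delays = [int(round(delay_s * sample_rate)) for delay_s, _ in paths]
    impulse = np.zeros(max(delays) + 1)
    for delay, (_, gain) in zip(delays, paths):
        impulse[delay] += gain  # one tap per propagation path
    # Convolving with the sparse impulse response sums the delayed copies.
    return np.convolve(source, impulse)

sample_rate = 16000                      # assumed sampling rate in Hz
source = np.zeros(1600)
source[0] = 1.0                          # unit impulse, so each path is visible
paths = [(0.0, 1.0), (0.020, 0.6), (0.080, 0.3)]  # direct path plus two echoes
observed = add_reverb(source, paths, sample_rate)
```

With a unit impulse as the source, the observed signal is the impulse response itself: one tap for the direct path, one echo 20 ms later, and one echo 80 ms later, the last lying beyond the 50-ms delay mentioned above.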
Several methods have been designed to separate the source
signal, such as independent component analysis [3],
independent vector analysis [4], spatial clustering-based
time–frequency masking, and beamforming [5]. Blind
Source Separation (BSS) reduces the detrimental
components in the signals obtained from the source and
separates the source signals from the mixture without any
prior knowledge. In addition, the weighted prediction
error (WPE) minimization method [6] has been adopted by
many researchers for blind dereverberation.
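The abstract names whitening as a preprocessing step ahead of BSS. A minimal sketch of PCA-based whitening on a two-channel mixture is given here for orientation. It is a generic textbook construction, not the LWPR-PCA procedure of this paper; the function name whiten, the mixing matrix, and the Laplace-distributed test sources are assumptions.

```python
import numpy as np

def whiten(x):
    """Whiten signals of shape (channels, samples) using PCA.

    Returns the whitened signals and the whitening (transformation) matrix.
    """
    x = x - x.mean(axis=1, keepdims=True)        # remove the mean per channel
    cov = x @ x.T / x.shape[1]                   # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric eigendecomposition
    w = np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return w @ x, w

rng = np.random.default_rng(0)
sources = rng.laplace(size=(2, 5000))            # hypothetical non-Gaussian sources
mixing = np.array([[1.0, 0.6],
                   [0.4, 1.0]])                  # hypothetical mixing matrix
mixtures = mixing @ sources

whitened, w = whiten(mixtures)
cov_whitened = whitened @ whitened.T / whitened.shape[1]
```

After whitening, the channels are uncorrelated with unit variance, so cov_whitened is the identity matrix up to numerical error. A separation stage then only needs to find a rotation of the whitened data.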
Recently, researchers have used several methods to
jointly optimize BSS and blind dereverberation [7].
These methods also exploit multi-input and multi-output
(MIMO)
✉ Jasmine J. C. Sheeja
jasminejcsheeja@gmail.com
1 Department of ECE, Rohini College of Engineering and Technology, Palkulam, Kanyakumari, India
2 Department of Biomedical Engineering, Sri Sakthi Institute of Engineering and Technology, Coimbatore, India
Neural Computing and Applications (2023) 35:7339–7356
https://doi.org/10.1007/s00521-022-07884-0