Conference Paper

Evaluating a New Algorithm for Multi-Talker Babble Noise Reduction Using Q-Factor Based Signal Decomposition

Abstract

Evaluating a New Algorithm for Multi-Talker Babble Noise Reduction Using Resonance Based Signal Decomposition. Roozbeh Soleymani, Ivan W. Selesnick, David M. Landsberger

Background: One of the key challenges for cochlear implant (CI) users is understanding speech in background noise. Many single-channel noise reduction algorithms have been introduced to address this issue. Typical algorithms include applying a gain to the noisy envelopes, pause detection/spectral subtraction, and feature extraction and splitting the spectrogram into noise- and speech-dominated tiles. However, even with these algorithms, speech understanding in the presence of competing talkers (i.e., speech babble noise) remains difficult, and additional artifacts are often introduced.

Algorithm: We have developed a new two-stage algorithm with the goal of improving the intelligibility of speech in background noise (including multi-talker babble). The first stage uses sparsity-based signal decomposition with two wavelet transforms, one with a high Q-factor and one with a low Q-factor, and solves the basis pursuit de-noising problem as the optimization method. The product of the first stage is a signal split into low-resonance (LR) and high-resonance (HR) components. The second stage involves temporal and spectral cleaning of the signal using the information obtained from the two components derived in the first stage.

Methods: The algorithm was evaluated by measuring subjects' understanding of IEEE standard sentences with and without processing by the algorithm. Sentences were presented against a background of 4-talker babble at four signal-to-noise ratios (SNRs) of 0, 3, 6, or 9 dB. Two randomly selected sentence sets (20 sentences) were presented for each of the 8 conditions (two processing conditions and 4 SNRs). The percentage of correct words in sentences was recorded. Prior to testing, subjects practiced with 20 processed sentences. After completing the speech understanding test, subjects were asked to evaluate the sound quality of the sentences using a MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor) scaling test. Normal hearing (NH) subjects will be tested using a noise-vocoded simulation, while CI users will be tested with un-vocoded stimuli.

Results: Preliminary results have been collected with 4 NH subjects. For all subjects, intelligibility and quality improved. While the improvement varied across subjects and SNRs, speech intelligibility improved by 10% to 20% and sound quality improved by 20% to 30%.

Conclusions: The new algorithm improves both speech understanding and sound quality for speech in multi-talker babble. Careful consideration is needed to implement the strategy in real time if it is to be used clinically.

Funding: NIH grant R01-DC12152
Figure: Per-subject speech intelligibility. % words correct (0 to 100) versus SNR (0, 3, 6, 9 dB) for subjects C105, C106, C107, C110, C113, C118 and C120; curves: Processed, Unprocessed, Quiet. Performance in quiet, as listed in the panels: 87.5%, 98.3%, 58.5%, 91.9%, 63.9%, 89.3%, 76.22%.
Figure: Per-subject sound quality. MUSHRA sound quality score (0 to 100) versus SNR (0, 3, 6, 9 dB) for the same subjects; curves: Processed, Unprocessed.
Figure: Example signal (amplitude -1 to 1, time 0 to 2 sec) before and after hard thresholding.
Roozbeh Soleymani, Ivan W. Selesnick, Natalia Stupak, David M. Landsberger
Figure: Mean speech quality in babble noise (CI). MUSHRA sound quality score (0 to 100) versus SNR (0, 3, 6, 9 dB); curves: Unprocessed, Processed.
Overview:
Performance with a cochlear implant (CI) in noise is generally poor.
Previous noise reduction techniques tend to work poorly in non-stationary noise (e.g., multi-talker babble).
We propose a new algorithm to improve speech intelligibility and sound quality in multi-talker babble that could be implemented in a speech processor.
Results demonstrate a consistent and significant benefit in intelligibility and sound quality.
Algorithm summary (see Figure 3):
Estimate the signal-to-noise ratio (SNR) to determine how aggressive the de-noising will be.
Decompose the signal into three components, a low Q-factor (LQF) component, a high Q-factor (HQF) component, and a residual noise component, using a sparse-optimization wavelet method (see "What is the Q-factor?").
Use the low Q-factor component as a template to further de-noise the high Q-factor component.
Add the de-noised high Q-factor signal to the low Q-factor component to create the de-noised output.
What is the Q-factor?
The Q-factor of a pulse is defined as the ratio of its center frequency to its bandwidth: $Q = f_c / \mathrm{BW}$, where $f_c$ is the center frequency and $\mathrm{BW}$ is the bandwidth.
A pulse with a high Q-factor (HQF) exhibits more sustained oscillatory behavior.
A pulse with a low Q-factor (LQF) exhibits less sustained oscillatory behavior.
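The definition above can be checked numerically. The following is a minimal sketch, not part of the poster: it assumes a Gaussian-windowed cosine as the pulse and takes the bandwidth as the half-power (-3 dB) width of the power spectrum, which is one common convention among several. The names `gabor_pulse` and `q_factor` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def gabor_pulse(fc, sigma, fs, duration=0.2):
    """Gaussian-windowed cosine at center frequency fc (Hz); sigma (s) sets its length."""
    t = np.arange(-duration / 2, duration / 2, 1.0 / fs)
    return np.exp(-t**2 / (2 * sigma**2)) * np.cos(2 * np.pi * fc * t)

def q_factor(pulse, fs, n_fft=1 << 16):
    """Estimate Q = center frequency / bandwidth from the power spectrum.

    Assumption: bandwidth is the half-power (-3 dB) width around the spectral peak.
    """
    power = np.abs(np.fft.rfft(pulse, n=n_fft)) ** 2   # zero-padded for fine resolution
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    peak = np.argmax(power)
    above = np.where(power >= 0.5 * power[peak])[0]    # bins at or above half power
    bandwidth = freqs[above[-1]] - freqs[above[0]]
    return freqs[peak] / bandwidth

fs = 16000
q_low = q_factor(gabor_pulse(1000, 0.001, fs), fs)    # short pulse, few cycles
q_high = q_factor(gabor_pulse(1000, 0.008, fs), fs)   # long pulse, many cycles
print(f"short pulse Q = {q_low:.1f}, long pulse Q = {q_high:.1f}")
```

Both pulses share the same center frequency; the longer pulse oscillates for more cycles, has a narrower spectrum, and therefore has the higher Q-factor.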


  

  
Assuming [...] modified version of [...] where the TSP of [...] is similar to the TSP of [...], then we can write: [...]
The clean speech signal, and hence its high Q-factor component, is not available. We modify [...] so that [...]. Therefore we have [...], which is the goal of this stage, so we have: [...]
[...] denote very high energy, high energy, low energy and very low energy.
We propose an algorithm which performs point-wise multiplication in the Fourier transform domain of the
non-overlapping frames of the incoming signal.
First we define an incoming frame with length [...]: [...]
Now applying the discrete Fourier transform (DFT) to both sides we have [...]
Now each point in [...] and [...] should be categorized as one of the following: [...]
where: [...]
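The frame-based step described above (point-wise multiplication in the Fourier transform domain of non-overlapping frames) can be sketched as follows. This is an illustration under stated assumptions, not the poster's cleaning rule: the four energy classes echo the categories named above, but the thresholds, the per-class gains, and the function names `four_class_gains` and `process_frames` are hypothetical.

```python
import numpy as np

def four_class_gains(spectrum, thresholds=(0.01, 0.1, 0.5), gains=(0.0, 0.3, 0.8, 1.0)):
    """Hypothetical rule: classify each DFT bin by magnitude relative to the frame peak.

    Classes run from very low energy (index 0) to very high energy (index 3);
    thresholds and gains are illustrative values, not taken from the poster.
    """
    mag = np.abs(spectrum)
    peak = mag.max()
    if peak == 0:
        return np.ones(len(spectrum))
    classes = np.digitize(mag / peak, thresholds)   # integer class 0..3 per bin
    return np.asarray(gains)[classes]

def process_frames(x, frame_len, gain_fn):
    """Split x into non-overlapping frames and multiply each frame's DFT point-wise."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len                  # trailing partial frame is dropped
    out = np.zeros(n_frames * frame_len)
    for m in range(n_frames):
        sl = slice(m * frame_len, (m + 1) * frame_len)
        spectrum = np.fft.rfft(x[sl])
        out[sl] = np.fft.irfft(spectrum * gain_fn(spectrum), n=frame_len)
    return out

# Toy input: a tone that sits exactly on DFT bin 16, plus weak broadband noise.
rng = np.random.default_rng(0)
n = np.arange(1024)
tone = np.cos(2 * np.pi * 16 * n / 256)
noisy = tone + 0.001 * rng.standard_normal(len(n))
cleaned = process_frames(noisy, 256, four_class_gains)
identity = process_frames(noisy, 256, lambda s: np.ones(len(s)))
```

With an all-ones gain the frames are reconstructed unchanged, which is a quick sanity check on the framing; with the illustrative four-class gains, the weak bins are attenuated and the dominant tone is kept.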
Evaluation Methods:
Subjects:
7 Advanced Bionics Fidelity 120 or Optima users with Medium Clear Voice.
Stimuli:
IEEE sentences in multi-talker babble
SNR 0, 3, 6, or 9 dB, Processed or Unprocessed
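For readers who want to reproduce stimuli of this kind, mixing speech and babble at a target SNR amounts to scaling the babble so that the speech-to-babble power ratio matches the target. The sketch below is an assumption about how such mixing is commonly done, not a description of the authors' procedure; `mix_at_snr` is a hypothetical helper, and random noise stands in for the recordings.

```python
import numpy as np

def mix_at_snr(speech, babble, snr_db):
    """Scale babble so that 10*log10(P_speech / P_babble) equals snr_db, then add it.

    Assumes babble is at least as long as speech; powers are mean-square values
    over the whole signal.
    """
    speech = np.asarray(speech, dtype=float)
    babble = np.asarray(babble, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_babble = np.mean(babble ** 2)
    gain = np.sqrt(p_speech / (p_babble * 10 ** (snr_db / 10)))
    return speech + gain * babble, gain

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)       # stand-in for a sentence recording
babble = 0.3 * rng.standard_normal(16000) # stand-in for multi-talker babble

measured = {}
for snr_db in (0, 3, 6, 9):
    noisy, gain = mix_at_snr(speech, babble, snr_db)
    measured[snr_db] = 10 * np.log10(np.mean(speech ** 2) / np.mean((gain * babble) ** 2))
```

Recomputing the SNR from the scaled babble, as in the loop above, is a simple way to confirm that each condition has the intended level.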
Evaluating Intelligibility:
Words correct were measured for 20 sentences for each of the 8 conditions (two
processing conditions and 4 SNRs) in a randomized order.
Evaluating Quality:
A MUSHRA test was used to rate sound quality for all 8 conditions relative to a
speech-in-quiet reference. The low-quality anchor was 6-talker babble noise without
speech. The process was repeated for 5 sentences.
Results: For all subjects, intelligibility and quality improved. Intelligibility improved
by 10 to 30% in 6-talker babble. Sound quality improved by 15 to 30 points.
Conclusions: The new algorithm might greatly improve performance in realistic noisy
environments (e.g., a cocktail party). We are working on a real-time
implementation of the new algorithm.
SNR Estimation
Signal Decomposition and initial de-noising
Spectral cleaning and re-composition
Results
Figure 4. Speech-in-noise intelligibility test mean results. Mean speech intelligibility in babble (CI): % words correct versus SNR (0, 3, 6, 9 dB); curves: Unprocessed, Processed.
Panel headings: Intelligibility, Sound Quality, Average Intelligibility, Average Quality. Annotations: improvement found at all SNRs; consistent improvement for all subjects.
Figure 1. Waveforms of the noise-free, noisy and de-noised signals and their LQF and HQF components. Panels: Noise Free (Clean) Speech; Noise Free Speech HQF Component; Noise Free Speech LQF Component; Noisy Speech; Noisy Speech HQF Component; Noisy Speech LQF Component; Noisy Speech Cleaned HQF; Output Processed Speech.
- Time axis (horizontal): 0 to 2 sec
- Added noise is 6-talker babble, SNR = 0 dB
- Each graph number corresponds to a block in the block diagram (Figure 3).
HQF: High Q-Factor
LQF: Low Q-Factor
Figure 2. Spectrograms of the noise-free, noisy and de-noised signals and their LQF and HQF components. Panels: Noise Free (Clean) Speech; Noise Free Speech HQF Component; Noise Free Speech LQF Component; Noisy Speech; Noisy Speech HQF Component; Noisy Speech LQF Component; Noisy Speech Residual Component; Noisy Speech Cleaned HQF; Output Processed Speech.
- Time axis (horizontal): 0 to 2 sec
- Frequency axis (vertical): 0 to 6000 Hz
- Added noise is 6-talker babble, SNR = 0 dB
- Each graph number corresponds to a block in the block diagram (Figure 3).
HQF: High Q-Factor
LQF: Low Q-Factor
Figure 3. System Block Diagram. Blocks shown: INPUT SNR ESTIMATION with branches SNR < 5 dB, 5 dB < SNR < 12 dB and SNR > 12 dB; WAVELET SETTING 1 and WAVELET SETTING 2; Q-FACTOR BASED SIGNAL DECOMPOSITION with outputs RES, LQF and HQF; SPECTRAL CLEANING PATTERN SELECTION; HQF SPECTRAL CLEANING; OUTPUT.
Each number in the block diagram corresponds to a plot in Figures 1 and 2.
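The three SNR branches in the block diagram (SNR < 5 dB, 5 dB < SNR < 12 dB, SNR > 12 dB) suggest a simple selection step on the estimated SNR. A minimal sketch follows; the regime labels and the function name `snr_regime` are hypothetical. How each branch maps to a wavelet setting or cleaning pattern is not recoverable from the diagram, so the function returns only the branch. The diagram uses strict inequalities, so assigning exactly 5 dB and 12 dB to the middle branch is an assumption.

```python
def snr_regime(snr_db):
    """Return which SNR branch of the block diagram an estimated SNR falls into.

    Assumption: the boundary values 5 dB and 12 dB go to the middle branch.
    """
    if snr_db < 5:
        return "low"     # SNR < 5 dB
    if snr_db > 12:
        return "high"    # SNR > 12 dB
    return "mid"         # 5 dB <= SNR <= 12 dB

# The four test SNRs used in the evaluation.
regimes = {snr: snr_regime(snr) for snr in (0, 3, 6, 9)}
print(regimes)
```

Under this mapping, the 0 and 3 dB test conditions fall in the low-SNR branch, and the 6 and 9 dB conditions fall in the middle branch.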

 

 
Figure: TQWT analysis filter bank and synthesis filter bank, performing decomposition, sparsification and de-noising. The parameter list and the optimization problem ("such that: [...]") are not recoverable from the extracted text.
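The idea of splitting a signal into a sustained-oscillation part and a transient part by sparse optimization can be illustrated with a small stand-in problem. The sketch below is not the authors' implementation and does not use the TQWT: it assumes an orthonormal DCT basis as a proxy for the high Q-factor transform and the identity (sample) basis as a proxy for the low Q-factor transform, and it solves the resulting basis pursuit de-noising problem with plain iterative soft thresholding (ISTA). All names and parameter values are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is a cosine atom (sustained oscillation)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def soft(v, t):
    """Soft-thresholding operator, the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def separate(x, lam_t=0.1, lam_o=0.1, n_iter=300):
    """Minimize 0.5*||x - w_t - D^T w_o||^2 + lam_t*||w_t||_1 + lam_o*||w_o||_1 by ISTA."""
    D = dct_matrix(len(x))
    w_t = np.zeros(len(x))   # transient coefficients (identity basis)
    w_o = np.zeros(len(x))   # oscillatory coefficients (DCT basis)
    step = 0.5               # 1/L, since the combined operator [I, D^T] has squared norm 2
    for _ in range(n_iter):
        r = x - w_t - D.T @ w_o                        # current residual
        w_t = soft(w_t + step * r, step * lam_t)
        w_o = soft(w_o + step * (D @ r), step * lam_o)
    oscillatory = D.T @ w_o
    return w_t, oscillatory, x - w_t - oscillatory

# Toy signal: one DCT atom (oscillation) plus two spikes (transients).
n = 256
coeffs = np.zeros(n)
coeffs[20] = 5.0
tone = dct_matrix(n).T @ coeffs
spikes = np.zeros(n)
spikes[50], spikes[180] = 3.0, -2.0
transient, oscillatory, residual = separate(tone + spikes)
```

On this toy signal the two parts separate cleanly because each is sparse in its own basis and spread out in the other; the same principle motivates using two wavelet transforms with different Q-factors.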
