
Accessible Broadcast Audio Personalisation for Hard of Hearing Listeners - Poster



Doctoral Consortium Poster
14 native English speakers.
Hearing abilities varied from normal hearing
with tinnitus and bilateral mild loss through to
unilateral profound loss.
To better understand
Hard of Hearing
listeners’ requirements
for accessible
broadcast audio
Figure 2: Mean word recognition rate for the normal
hearing cohort (n=24), shown with standard error bars.
[1] Action on Hearing Loss, “Hearing Matters Report”, 2015.
[2] Royal National Institute for Deaf People (RNID), “Annual survey report 2008”, 2008.
[3] Shirley, et al., “Personalized Object-Based Audio for Hearing Impaired TV Viewers”, JAES, vol. 65, no. 4, 2017.
[4] M. Armstrong, “From Clean Audio to Object Based Broadcasting”, BBC R&D White Paper, WHP 324, Sept. 2016.
[5] N. Miller, “Measuring up to speech intelligibility”, Int. J. Lang. Commun. Disord., vol. 48, no. 6, 2013.
[6] D. Kalikow, et al., “Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability”, JASA, vol. 61, no. 5, pp. 1337-1351, 1977.
Twitter @thepengineer
Accessible Broadcast Audio Personalisation
for Hard of Hearing Listeners
Lauren Ward
Supervisors: Dr. Ben Shirley and Prof. William J Davies
Acoustics Research Centre, University of Salford, Manchester, UK
The variability in the Hard of Hearing listener results highlights
that any ‘one size fits all’ solution will not be effective.
Current Work
Future Work
11 Million
Number of people in the U.K. with Hearing
Impairment [1]
Percentage of Hard of Hearing viewers who report
issues understanding speech on TV [2]
Hearing Loss Broadcast Audio
Can offer an improved broadcast experience for Hard
of Hearing Listeners [3]
Object-Based Broadcast
Presents technological basis for personalisation of
broadcast content [4]
What is the
relationship between
non-speech audio
objects and speech
Does it differ for
Hard of Hearing
Research Question 2
How can the relationship
between non-speech audio
objects and speech be used
intelligently at point-of-
How could this aid
personalisation of
Hard of Hearing Listeners
Methodology
Four Experimental Conditions - Keyword for recognition in bold
Condition LP (Low Predictability): “Mary should think about the sword”
Condition HP (High Predictability): “He killed the dragon with his sword”
Condition LP + SFX: LP sentence as above with SFX of slashing sword
Condition HP + SFX: HP sentence as above with SFX of slashing sword
Adapted the Revised Speech Perception in Noise test [6], which evaluates how
increasing the predictability of speech affects keyword recognition in multi-talker
babble, by adding relevant acoustic context to 50% of the stimuli.
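As a rough illustration of how responses in the four conditions could be scored, the sketch below computes a keyword recognition rate per condition. The trial records, the exact-match scoring rule and the function name are assumptions made for illustration; the poster does not describe its scoring procedure, and the data shown are invented.

```python
from collections import defaultdict

# The four experimental conditions described on the poster.
CONDITIONS = ("LP", "HP", "LP+SFX", "HP+SFX")


def recognition_rates(trials):
    """Return the fraction of keywords correctly recognised per condition.

    trials: iterable of (condition, target_keyword, response) tuples.
    Scoring rule (an assumption): a response is correct when it matches
    the keyword exactly, ignoring case and surrounding whitespace.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for condition, keyword, response in trials:
        if condition not in CONDITIONS:
            raise ValueError(f"unknown condition: {condition}")
        total[condition] += 1
        if response.strip().lower() == keyword.strip().lower():
            correct[condition] += 1
    return {c: correct[c] / total[c] for c in CONDITIONS if total[c]}


# Invented example responses, for illustration only (not study data).
example_trials = [
    ("LP", "sword", "sword"),
    ("LP", "sword", "word"),
    ("HP", "sword", "sword"),
    ("HP", "sword", "Sword "),
    ("LP+SFX", "sword", "sword"),
    ("HP+SFX", "sword", "sword"),
]
example_rates = recognition_rates(example_trials)
```

Keeping the condition labels in one tuple means a mistyped label fails loudly rather than silently creating a fifth condition.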
In complex listening scenarios non-speech cues, such as context, language
structure and gestures, aid speech intelligibility [5].
Early approaches to creating ‘clean audio’ solutions for Hard of Hearing listeners
have taken a binary approach, speech vs. non-speech, suppressing all non-speech
elements without consideration of the narrative role they may play.
Research into personalisable object-based audio for Hard of Hearing listeners,
which grouped audio objects like SFX into stems with adjustable volume, showed
improvement in the perceived clarity of speech [3].
Used a subset of stimuli.
Calibrated the signal to babble ratio for each listener.
24 native English speakers.
Signal to Babble ratio was set to -2 dB.
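One way to realise a fixed signal to babble ratio such as -2 dB is to scale the babble so that the ratio of RMS levels matches the target before mixing. The sketch below assumes RMS-based level measurement and equal-length sample lists; the poster does not state how levels were measured, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_sbr(speech, babble, sbr_db):
    """Scale babble so that 20*log10(rms(speech)/rms(babble)) == sbr_db.

    Returns (mixture, scaled_babble). Both inputs must be the same length
    and must not be silent.
    """
    if len(speech) != len(babble):
        raise ValueError("speech and babble must be the same length")
    # Babble RMS required to reach the target ratio.
    target_babble_rms = rms(speech) / (10 ** (sbr_db / 20))
    gain = target_babble_rms / rms(babble)
    scaled = [gain * b for b in babble]
    mixture = [s + b for s, b in zip(speech, scaled)]
    return mixture, scaled


# Synthetic stand-ins for speech and babble (sine tones, illustration only).
speech = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
babble = [0.3 * math.sin(2 * math.pi * 13 * n / 100 + 0.5) for n in range(100)]
mixture, scaled_babble = mix_at_sbr(speech, babble, -2.0)
```

A negative ratio means the babble ends up at a higher RMS level than the speech. For per-listener calibration, as used with the Hard of Hearing cohort, the same function could be called with a different `sbr_db` for each listener.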
Research Question 1
Figure 3: Box plot of mean
improvement in word
recognition rate for the Hard of
Hearing cohort (n=14)
Prior Work
To evaluate the effects of salient SFX on speech intelligibility in multi-talker babble
noise for normal and Hard of Hearing cohorts.
Normal Hearing Listeners
Increasing predictability
improves intelligibility by
73.5% [p<0.001].
Adding acoustic cues
improves intelligibility by
69.5% [p<0.001], similar to
increasing predictability.
Both cues work together
to further improve
intelligibility by 18.7%
[p<0.001], compared with
acoustic cues only.
Key Results
Key Results
Acoustic cues improved intelligibility
for half the participants.
Of those, 4 participants showed
greater than 20% improvement.
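A per-listener summary of this kind could be computed along the following lines. This is a sketch under stated assumptions: improvement is taken as the mean recognition rate with SFX minus the mean rate without SFX, in percentage points. The poster does not define the measure precisely, and the participant data below are invented.

```python
def sfx_improvement(rates):
    """Mean rate with SFX minus mean rate without SFX, in percentage points.

    rates: dict mapping condition label to recognition rate in percent.
    """
    without_sfx = (rates["LP"] + rates["HP"]) / 2
    with_sfx = (rates["LP+SFX"] + rates["HP+SFX"]) / 2
    return with_sfx - without_sfx


def summarise(participants, threshold=20.0):
    """Return (improvements, improved_ids, large_improvement_ids)."""
    improvements = {pid: sfx_improvement(r) for pid, r in participants.items()}
    improved = [pid for pid, delta in improvements.items() if delta > 0]
    large = [pid for pid in improved if improvements[pid] > threshold]
    return improvements, improved, large


# Invented rates for three hypothetical listeners (not study data).
example_participants = {
    "P1": {"LP": 20, "HP": 50, "LP+SFX": 50, "HP+SFX": 70},
    "P2": {"LP": 40, "HP": 60, "LP+SFX": 30, "HP+SFX": 50},
    "P3": {"LP": 30, "HP": 50, "LP+SFX": 40, "HP+SFX": 60},
}
improvements, improved, large = summarise(example_participants)
```

Reporting each listener separately, rather than only a cohort mean, keeps visible the variability that a single average would hide.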
Intelligent Personalisation
In answering Research Question 2, methods will be developed for:
Calibrating and creating individual user requirement profiles.
Integration of these profiles into existing intelligibility metrics, to
adapt the level of different audio objects, such as sound effects, to
listener requirements.
These will be evaluated using perceptual listening studies with audio-
visual broadcast content.
The negative impact that acoustic cues had on intelligibility for some Hard of
Hearing listeners may have been caused by masking or cognitive overload effects.
Further investigation will endeavour to determine which of these possible effects is
the most prominent.