
Brain mechanisms involved in the use of visual speech to compensate for acoustically degraded speech

Authors:
Heidi S. Økland, Lucy MacGregor, Saskia Helbling, Helen Blank & Matt Davis
MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, United Kingdom
Methods: MEG/EEG study
Trial structure: see the timeline figure below.
Task: Attend to both video and audio at all times. Repeat the sentence as accurately as possible.
Results: MEG coherence (2-8 Hz, gradiometers)
[Figure panels: coherence with the speech envelope (AV + AO) and with lip aperture (AV + VO).]
Analysis: Next steps
1. At which sensors/source locations and frequencies do we observe above-chance auditory and visual speech-to-brain coherence? (One standard test is sketched below.)
2. Is visual enhancement (AV vs. AO) supported by increased coherence with auditory and/or visual speech cues in
- auditory and/or visual cortex?
- "higher-level" areas like pSTS/motor cortex?
3. Individual differences:
- Do the coherence effects in 2. above correlate with visual enhancement (AV vs. AO) and/or lip-reading (VO) as measured by word report?
- Do these behavioural effects also correlate with other neural structures or cognitive abilities (e.g. MRI white/grey matter, digit span, verbal IQ)?
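Question 1 asks whether coherence exceeds chance. The poster does not specify a test, so the sketch below assumes a standard permutation approach: shuffle which cue timecourse is paired with which trial, recompute coherence, and compare. All names (`coherence_fn`, `cues`, `meg_trials`) are illustrative, not the authors' pipeline.

```python
# Hedged sketch: permutation test for above-chance cue-to-brain coherence.
# `cues` is a list of per-trial cue timecourses (speech envelope or lip
# aperture) and `meg_trials` the matching per-trial MEG epochs; shuffling
# the pairing breaks the true cue-brain correspondence while leaving the
# spectral content of both signals intact.
import numpy as np

def permutation_test(cues, meg_trials, coherence_fn, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    observed = coherence_fn(cues, meg_trials)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = [cues[j] for j in rng.permutation(len(cues))]
        null[i] = coherence_fn(shuffled, meg_trials)
    # One-sided p-value: proportion of shuffles that match or exceed the
    # observed coherence (+1 correction counts the observed value itself).
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value
```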
Introduction
Seeing the face of the speaker aids speech comprehension when the acoustic speech signal is degraded. This visual enhancement is:
- well known, but its neural mechanisms remain poorly understood
- highly variable: some people benefit more from visual speech cues than others
We address both points and ask: how does the coordination of neural oscillations with auditory and visual speech cues support comprehension of acoustically degraded speech in different people?
Behavioural study
Native British English speakers with normal hearing (N=11) watched/listened to video recordings of sentences in three conditions:
- Audio-Visual (AV)
- Auditory-Only (AO)
- Visual-Only (VO)
The six levels of acoustic clarity were created by linearly mixing 1-channel and 16-channel noise-vocoded speech from 0 to 100% in steps of 20% (e.g. 20% 16ch + 80% 1ch); a code sketch of this mixture follows below.
Task: Attend to both video and audio at all times. Repeat the sentence as accurately as possible.
Participants' spoken responses were transcribed and scored for the number of words matching the target sentence (word report).
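The clarity manipulation is just a weighted sum of two vocoded renderings of the same sentence. A minimal sketch of that mixture, assuming `s16` and `s1` are equal-length 16-channel and 1-channel noise-vocoded waveforms (the vocoder itself is not part of the poster text and is not reproduced here):

```python
# Hedged sketch of the clarity mixture described above: a linear blend of
# 16-channel and 1-channel noise-vocoded versions of the same sentence.
# `s16` and `s1` are assumed to be equal-length numpy arrays.
import numpy as np

def mix_clarity(s16: np.ndarray, s1: np.ndarray, prop_16ch: float) -> np.ndarray:
    """Return a blend with proportion `prop_16ch` of the 16-channel signal,
    e.g. prop_16ch=0.2 gives 20% 16ch + 80% 1ch."""
    return prop_16ch * s16 + (1.0 - prop_16ch) * s1

# The poster's six clarity levels, from 0% to 100% 16ch in 20% steps.
clarity_levels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
```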
[Figure: trial timeline: sentence video (~5 s), then intervals of 1-1.3 s, 0.5-0.65 s and 0.5 s, then word report.]
[Figure: speech envelope and lip aperture timecourses for the example sentence "His curiosity about the world led him to pursue a career as a scientist"; 6 levels of acoustic clarity. Error bars: SEM.]
Results: Word report
Design
Condition | Coherence computed with           | Low clarity | Medium clarity | N/A
AV        | + speech envelope, + lip aperture | 55 trials   | 55 trials      | -
AO        | + speech envelope, - lip aperture | 55 trials   | 55 trials      | -
VO        | - speech envelope, + lip aperture | -           | -              | 55 trials
Participants: 14 native British English speakers with normal hearing.
Analysis
- We quantify the coordination of neural oscillations with speech cues as coherence: the degree to which the phase relationships of two signals are consistent.
- We extract the speech envelope and lip aperture timecourses for all sentence videos and calculate the coherence between these speech cues and the MEG recording (a minimal sketch follows below).
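As a concrete illustration of these two steps, here is a minimal sketch using the Hilbert envelope and scipy's magnitude-squared coherence. Note that scipy's measure reflects amplitude as well as phase covariation, so a purely phase-based variant would differ; the sampling rate and window length are assumptions, not the authors' parameters.

```python
# Hedged sketch of the analysis: (1) speech envelope via the analytic
# signal, (2) coherence between a cue timecourse and one MEG channel,
# averaged over 2-8 Hz (the band shown in the MEG coherence results).
import numpy as np
from scipy.signal import hilbert, coherence

fs = 250  # assumed common sampling rate after resampling audio and MEG

def speech_envelope(audio: np.ndarray) -> np.ndarray:
    """Broadband amplitude envelope: magnitude of the analytic signal."""
    return np.abs(hilbert(audio))

def band_coherence(cue: np.ndarray, meg: np.ndarray,
                   fmin: float = 2.0, fmax: float = 8.0) -> float:
    """Mean magnitude-squared coherence between a cue (speech envelope or
    lip aperture timecourse) and a MEG channel within fmin-fmax Hz."""
    f, cxy = coherence(cue, meg, fs=fs, nperseg=2 * fs)  # 2 s segments
    band = (f >= fmin) & (f <= fmax)
    return float(cxy[band].mean())
```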
[Figure: word report (0-100%) as a function of acoustic clarity (proportion 16ch vocoded speech, 0-100%) for AO, AV and VO; N=7. Conditions in dashed rectangles were selected for the M/EEG study. Error bars: SEM.]
[Figure labels: Auditory-Only, Audio-Visual, Visual-Only; selected M/EEG conditions AV med, AV low, AO med, AO low and VO; visual enhancement = AV vs. AO.]
E-mail: Heidi.SolbergOkland@mrc-cbu.cam.ac.uk
MEG/EEG setup (Elekta)
[Figure: single-subject results for S01, S04, S05 and S06; coherence scale 0.0-0.2, word report scale 0-100%.]