Parametric information about eye movements is sent to the ears
Stephanie N. Lovich a,b,c,d,1,2, Cynthia D. King a,b,c,d,1, David L. K. Murphy a,c,d,1, Rachel E. Landrum a,b,c,d, Christopher A. Shera e, and Jennifer M. Groh a,b,c,d,f,g,2
Edited by Marcus Raichle, Washington University in St. Louis School of Medicine, St. Louis, MO; received March 9, 2023; accepted September 28, 2023
RESEARCH ARTICLE | NEUROSCIENCE
Eye movements alter the relationship between the visual and auditory spatial scenes. Signals related to eye movements affect neural pathways from the ear through auditory cortex and beyond, but how these signals contribute to computing the locations of sounds with respect to the visual scene is poorly understood. Here, we evaluated the information contained in eye movement-related eardrum oscillations (EMREOs), pressure changes recorded in the ear canal that occur in conjunction with simultaneous eye movements. We show that EMREOs contain parametric information about horizontal and vertical eye displacement as well as initial/final eye position with respect to the head. The parametric information in the horizontal and vertical directions can be modeled as combining linearly, allowing accurate prediction of the EMREOs associated with oblique (diagonal) eye movements. Target location can also be inferred from the EMREO signals recorded during eye movements to those targets. We hypothesize that the (currently unknown) mechanism underlying EMREOs could impose a two-dimensional eye-movement-related transfer function on any incoming sound, permitting subsequent processing stages to compute the positions of sounds in relation to the visual scene.

otoacoustic emissions | saccades | reference frames | coordinate transformations | sound localization
Every time we move our eyes to localize multisensory stimuli, our retinae move in relation to our ears. These movements shift the alignment of the visual scene (as detected by the retinal surface) with respect to the auditory scene (as detected based on timing, intensity, and frequency differences in relation to the head and ears). Precise information about each eye movement is therefore needed to connect the brain's views of visual and auditory space to one another (e.g., refs. 1–3). Most previous work about how eye movement information is incorporated into auditory processing has focused on cortical and subcortical brain structures (4–24), but the recent discovery of eye-movement-related eardrum oscillations (EMREOs) (25–28) suggests that the process might manifest much earlier in the auditory periphery. EMREOs can be thought of as a biomarker of underlying efferent information impacting the internal structures of the ear in association with eye movements. What information this efferent signal contains is currently uncertain.
We reasoned that if this efferent signal is to play a role in linking auditory and visual space across eye movements, EMREOs should be parametrically related to the associated eye movement. Specifically, EMREOs should vary in a regular and predictable fashion with both horizontal and vertical displacements of the eyes, and some form of information regarding the initial position of the eyes should also be present. These properties are required if the efferent signal underlying EMREOs is to play a role in linking hearing and vision. Notably, this parametric relationship is not required of alternative possible roles, such as synchronizing visual and auditory processing in time or enhancing attentional processing of sounds regardless of their spatial location (29–33).
Accordingly, we evaluated the parametric spatial properties of EMREOs in human participants by varying the starting and ending positions of visually guided saccades in two dimensions. We find that EMREOs do in fact vary parametrically depending on the saccade parameters in both horizontal and vertical dimensions and as a function of both initial eye position in the orbits and the change in eye position relative to that initial position. EMREOs associated with oblique (diagonal) saccades can be predicted by the linear combination of the EMREOs associated with strictly horizontal and vertical saccades. Furthermore, an estimate of target location can be decoded from EMREOs alone; i.e., where subjects looked in space can be roughly determined from their observed EMREOs.

These findings suggest that the eye movement information needed to accomplish a coordinate transformation of incoming sounds into a visual reference frame is fully available in the most peripheral part of the auditory system. While the precise mechanism that creates EMREOs remains unknown, we propose that the underlying mechanisms might
introduce a transfer function to the sound transduction process that serves to adjust the gain, latency, and/or spectral dependence of responses in the cochlea. In principle, this could provide later stages of auditory processing access to an eye-centered signal of sound location for registration with the eye-centered visual scene (1). Indeed, recent work has shown that changes in muscular tension on the ossicular chain would be expected to affect the gain and latency of sound transmission through the middle ear, thus supporting the plausibility of this hypothesis (34, 35).

Significance

When the eyes move, the alignment between the visual and auditory scenes changes. We are not perceptually aware of these shifts, which indicates that the brain must incorporate accurate information about eye movements into auditory and visual processing. Here, we show that the small sounds generated within the ear by the brain contain accurate information about contemporaneous eye movements in the spatial domain: The direction and amplitude of the eye movements could be inferred from these small sounds. The underlying mechanism(s) likely involve(s) the ear's various motor structures and could facilitate the translation of incoming auditory signals into a frame of reference anchored to the direction of the eyes and hence the visual scene.

Author contributions: S.N.L., C.D.K., D.L.K.M., and J.M.G. designed research; S.N.L., C.D.K., R.E.L., and J.M.G. performed research; S.N.L., C.D.K., D.L.K.M., C.A.S., and J.M.G. contributed new reagents/analytic tools; S.N.L., C.D.K., D.L.K.M., and J.M.G. analyzed data; and S.N.L., C.D.K., D.L.K.M., and J.M.G. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

Copyright © 2023 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

1 S.N.L., C.D.K., and D.L.K.M. contributed equally to this work.

2 To whom correspondence may be addressed. Email: stephanie.schlebusch@duke.edu or jmgroh@duke.edu.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2303562120/-/DCSupplemental.

Published November 21, 2023.
Methods
We used earbud microphones to record internally generated oscillations in the ear canals of human subjects with normal hearing and corrected-to-normal vision. All procedures concerning human participants were approved by the Duke University Institutional Review Board, and all participants provided informed consent before beginning the experiments.
Participants performed eye movement tasks involving various visual fixation and target configurations (SI Appendix, Fig. S1). No external sounds were presented in any task. At the beginning of each trial, subjects fixated on a visual fixation point for a minimum of 200 ms and then made a saccade to a second target, which they then fixated on for another 200 ms (SI Appendix, Fig. S1A). Any trials with micro- or corrective saccades during the 200 ms prior to or following the main fixation-point-to-target saccade were discarded, to ensure that a stable baseline ear canal recording could be established without intrusions by other eye movements. Additional methodological details can be found in SI Appendix.
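To make the exclusion step concrete, the following minimal sketch shows one way such screening could be implemented. This is our own illustration, not the authors' pipeline: the velocity threshold, sampling rate, and function names are all assumptions.

```python
import numpy as np

def has_intrusive_saccade(eye_pos_deg, fs_hz=1000.0, vel_thresh_deg_s=30.0):
    """Return True if a fixation epoch (eye position in degrees, sampled at
    fs_hz) contains a micro- or corrective saccade, using a simple
    velocity-threshold criterion (threshold value is illustrative only)."""
    velocity = np.gradient(eye_pos_deg) * fs_hz  # instantaneous speed, deg/s
    return bool(np.any(np.abs(velocity) > vel_thresh_deg_s))

def clean_trial_mask(pre_fix, post_fix):
    """Keep trials whose 200-ms pre- and post-saccade fixation windows are
    both free of intrusive eye movements. Inputs are hypothetical
    (n_trials x n_samples) arrays of eye position."""
    return np.array([not (has_intrusive_saccade(a) or has_intrusive_saccade(b))
                     for a, b in zip(pre_fix, post_fix)])
```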
Results
We first tested subjects (N = 10) on a task involving variation in both initial fixation position and target location along both horizontal and vertical dimensions: the "five-origin grid task." Subjects fixated on an initial fixation light located either straight ahead, 9° left or right, or 6° up or down, and then made a saccade to a target located within an array of possible target locations spanning ±18° horizontally and ±12° vertically (Fig. 1, Inset and SI Appendix, Fig. S1B). Results of this task are shown in Fig. 1. Each panel shows the average microphone signal recorded in the left ear canal (averaged across all subjects) associated with saccades to a target at that location; e.g., the top right panel shows all saccades to the top right target location. The color and line styles of the waveforms correspond to the five initial fixation positions from which the saccades could originate in space.
The first overall observation from this figure is that the magnitude of the EMREO waveform depends on both the horizontal and vertical dimensions. In the horizontal dimension, EMREOs are larger for more contralateral target locations: Compare the column on the right (contralateral) to the column on the left (ipsilateral). The pattern is reversed for right ear canal recordings (SI Appendix, Fig. S3). In the vertical dimension, EMREOs are larger for higher vs. lower targets in both left and right ears (compare the top row to the bottom row in Fig. 1 and SI Appendix, Fig. S3).
The second overall observation from this figure is that the phase of the EMREO waveform depends on the horizontal location of the target with respect to the fixation position. Specifically, the first deflection after saccade onset is a peak for the most ipsilateral targets (left-most column) and a trough for the most contralateral targets (right-most column). However, where this pattern reverses depends on the initial fixation position. Specifically, consider the red vs. blue traces in the middle column of the figure, which correspond to targets along the vertical meridian. Red traces involve saccades to these targets from the fixation position on the right, and thus involve leftward (ipsiversive) saccades. The red traces in this column begin with a peak followed by a trough. In contrast, the blue traces involve saccades to these targets from the fixation position on the left, i.e., rightward or contraversive saccades. The blue traces begin with a trough followed by a peak. The pattern is particularly evident in the central panel (see arrows).
The phase reversal as a function of the combination of target location and initial eye position suggests that the EMREO waveforms might align better when plotted in an eye-centered frame of reference. Fig. 2 demonstrates that this is indeed the case: the data from Fig. 1 were replotted as a function of target location relative to the initial fixation position. The eight panels around the center represent the traces for the subset of targets that can be fully analyzed in an eye-centered frame, i.e., the targets immediately left, right, up, down, and diagonal relative to the five fixation locations. When the data are plotted based on the location of the targets relative to the origins, the waveforms are better aligned, showing no obvious phase reversals.
Although the waveforms are better aligned when plotted based on target location relative to initial eye position, some variation related to that fixation position is still evident in the traces. That is, in each panel, the EMREO waveforms with different colors/line styles (corresponding to different fixation positions) do not necessarily superimpose perfectly. This suggests that a model that incorporates both relative target position and original fixation position, in both horizontal and vertical dimensions, is needed to account for the findings. Furthermore, a statistical accounting of these effects is needed. Accordingly, we fit the data to the following regression equation:

Mic(t) = B_H(t)H + B_ΔH(t)ΔH + B_V(t)V + B_ΔV(t)ΔV + C(t),    [1]
where H and V correspond to the initial horizontal and vertical eye position and ΔH and ΔV correspond to the respective changes in position associated with that trial. The slope coefficients B_H, B_ΔH, B_V, and B_ΔV are time-varying and reflect the dependence of the microphone signal on the respective eye position/movement parameters. The term C(t) contributes a time-varying "constant" independent of the eye movement metrics and can be thought of as the best-fitting average oscillation across all initial eye positions and changes in eye position. We used the measured values of eye position/change in eye position for this analysis, rather than the associated fixation and target locations, so as to incorporate trial-by-trial variability in fixation and saccade accuracy.
This model is a conservative one, assessing whether a linear relationship between the microphone signal and the relevant eye position/movement variables can provide a satisfactory fit to the data. As such, it provides a lower bound but does not preclude that higher-quality fits could be achieved via nonlinear modeling. This approach is similar to the general linear models applied to fMRI data (e.g., ref. 36) and differs chiefly in that we make no assumptions about the underlying temporal profile of the signal (such as a hemodynamic response function) but allow the temporal pattern to emerge in the time-varying fits of the coefficients.
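As a minimal sketch of this fitting procedure (our own illustration; the analysis code actually used accompanies the deposited dataset, ref. 72), the regression of Eq. 1 can be solved independently at each time point by ordinary least squares:

```python
import numpy as np

def fit_emreo_regression(mic, H, dH, V, dV):
    """Fit Eq. 1 at every time point by ordinary least squares.

    mic    : (n_trials, n_times) microphone signal (z-scored)
    H, V   : (n_trials,) initial horizontal/vertical eye positions (deg)
    dH, dV : (n_trials,) horizontal/vertical changes in eye position (deg)
    Returns an array of shape (5, n_times): rows are the time courses of
    B_H(t), B_dH(t), B_V(t), B_dV(t), and C(t).
    """
    X = np.column_stack([H, dH, V, dV, np.ones_like(H)])  # design matrix
    coeffs, *_ = np.linalg.lstsq(X, mic, rcond=None)      # one fit per time column
    return coeffs
```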
Fig. 3 shows the average of these time-varying values of the slope coefficients across subjects (blue = left ear; red = right ear) and provides information about the contribution of these various eye movement parameters to the EMREO signal. A strong, consistent dependence on horizontal eye displacement is observed, consistent with our previous report (Fig. 3A) (25). This component is oscillatory and begins slightly before the onset of the eye movement, inverting in phase for left vs. right ears. The thickened parts of the line indicate periods of time when this coefficient differed significantly from 0 with 95% confidence (shaded areas are ±SEM). There is also an oscillatory and binaurally phase-inverting signal
related to the initial position of the eyes in the horizontal dimension (Fig. 3B). This signal is smaller and more variable across subjects.

In the vertical dimension, the effect of vertical saccade amplitude is in phase for both the left and right ears; it exhibits an oscillatory pattern, although not obviously sinusoidal like the one observed for the horizontal saccade amplitude. The initial position of the eyes in the vertical dimension exerts a variable effect across participants, such that it is not particularly evident in this grand average analysis; this may be related to poorer abilities to localize sounds in the vertical vs. horizontal dimensions (37–40).
[Figure 1 here: a 5 × 5 grid of waveform panels, one per target location, plus a legend of the five origin (fixation) positions: left (−9,0), center (0,0), right (9,0), up (0,6), down (0,−6). Each panel plots the microphone signal (SD) against time re: saccade onset (ms); targets span ±18° horizontally and ±12° vertically.]

Fig. 1. EMREOs recorded during the five-origin grid task. Each panel shows the grand average EMREO signal generated when saccades were made to that location on the screen (average of N = 10 subjects' individual left ear averages). For example, the Top Right panel shows microphone recordings during saccades to the top right (contralateral) target location, and the color and line styles of each trace in that panel correspond to saccades from different initial fixation points; e.g., the red traces originated from the rightward fixation, the blue from the leftward fixation, etc., as indicated by the legend and boxes of the same color and line style. Both magnitude and phase vary as a function of initial eye position and target location, with contralateral responses being larger than ipsilateral. Phase reversal occurs based on the location of the target with respect to the initial fixation position, as can be seen for the central target location, where the EMREOs evoked for saccades from the rightward fixation (red traces) show an opposite phase relationship to those evoked for saccades from the leftward fixation (blue traces). Corresponding grand averages for right ear data are shown in SI Appendix, Fig. S3. These data are presented in Z-units; the peak-equivalent sound levels for 18° horizontal targets are roughly 55 to 56 dB SPL; see SI Appendix, Fig. S2 for the mean and distributions across the subject population (range ~49 to 64 dB SPL).

Finally, there is a constant term that is similar in the two ears and peaks later with respect to saccade onset than is the case for the other coefficients (Fig. 3E). As noted above, this constant term can be thought of as encapsulating the average EMREO waveform
that occurs when pooling across all the eye movements in the
dataset, regardless of their initial positions or horizontal or vertical
components.
We next investigated whether the fits obtained in one task match those obtained in a different task. We reasoned that if the information contained in the EMREO signal reflects the eye movement itself, then task context should not matter. Furthermore, the regression model should provide a good way to accomplish this comparison, since it does not require that the exact same locations and eye movements be tested across tasks.

To test these questions, we collected data using two simplified tasks: the single-origin grid task (with a single initial fixation in the center, SI Appendix, Fig. S1C) and the horizontal/vertical task (with a single fixation at the center and targets on the horizontal and vertical meridians, generating purely horizontal or vertical saccades, SI Appendix, Fig. S1D). Ten subjects (four of whom also completed the five-origin grid task) completed both the single-origin grid task and the horizontal/vertical task. We fit the results from these tasks using the same regression procedure but omitting the initial fixation position terms, i.e.:

Mic(t) = B_ΔH(t)ΔH + B_ΔV(t)ΔV + C(t).    [2]
As shown in Fig. 4, both tasks yield similar values of the regression coefficients for horizontal change-in-position (B_ΔH(t)) and the constant term (C(t)) (grand average across the population, black vs. green traces). The vertical change-in-position term (B_ΔV(t)) was slightly more variable but also quite consistent across tasks.
Given the consistency of the regression coefficient values between the single-origin grid and horizontal/vertical tasks (and see ref. 26 for similar results involving spontaneous vs. task-guided eye movements), we surmised that it should be possible to use the coefficient values from one task to predict the EMREO waveforms in the other. Specifically, we used the time-varying regression values from purely horizontal and purely vertical saccades in the horizontal/vertical task to predict the observed waveforms from oblique saccades in the single-origin grid task. This method can be used to evaluate the quality of the regression-based EMREO prediction not only for target locations tested in both tasks, i.e., the horizontal and vertical meridians, but also for oblique targets tested only in the grid task.
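A minimal sketch of this cross-task prediction, under the same assumptions as the fitting sketch above: Eq. 2 is simply evaluated with the horizontal/vertical-task coefficient time courses at the displacement of each grid-task target.

```python
def predict_emreo(b_dh, b_dv, c, dH, dV):
    """Predict an EMREO waveform for a saccade with horizontal and vertical
    components dH, dV (degrees), using Eq. 2 coefficient time courses
    (b_dh, b_dv, c; numpy arrays of shape (n_times,)) fit to the
    horizontal/vertical task. Returns the predicted waveform, shape (n_times,)."""
    return b_dh * dH + b_dv * dV + c

# For example, an oblique saccade 9 degrees right and 6 degrees up:
# predicted = predict_emreo(b_dh, b_dv, c, dH=9.0, dV=6.0)
```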
[Figure 2 here: eight waveform panels arranged by eye-centered target location around a central legend of the five fixation positions (left (−9,0), center (0,0), right (9,0), up (0,6), down (0,−6)); each panel plots the microphone signal (SD) against time re: saccade onset (ms).]

Fig. 2. Replotting the grand average EMREOs as a function of relative target location shows better, but not perfect, correspondence of the EMREOs across different fixation positions. The data shown are a subset of those shown in Fig. 1, but here each panel location corresponds to a particular target location defined relative to the associated fixation position. The color/line style indicates the associated fixation position. For example, the waveforms in the upper right panel all involved 9° rightward and 6° upward saccades; the red trace in that panel indicates those that originated from the 9° right fixation, the blue those from the 9° left fixation, etc. Only relative target locations that existed for all five fixation positions are plotted, as indicated by the inset. Corresponding right ear data are shown in SI Appendix, Fig. S4.

The black traces in Fig. 5 show the grand average microphone signals associated with each target in the single-origin grid task. The location of each trace corresponds to the physical location of the associated target in the grid task (similar to Fig. 1). The superimposed predicted waveforms (red traces) were generated from
the B_ΔH(t), B_ΔV(t), and C(t) regression coefficients fit to only the horizontal/vertical data and then evaluated at each target location and moment in time to produce predicted curves for each of the locations tested in the grid task.
Overall, there is good correspondence between the predicted EMREO oscillations and the observed EMREOs from the actual microphone recordings, including at the oblique target locations that were not tested in the horizontal/vertical task. This illustrates two things: 1) the EMREO is reproducible across task contexts, and 2) the horizontal and vertical change-in-position contributions interact in a reasonably independent way, so that the EMREO signal observed for a combined horizontal-vertical saccade can be predicted as the sum of the signals observed for purely horizontal and purely vertical saccades with the corresponding component amplitudes.
Given that it is possible to predict the microphone signal from one task context to another, it should also be possible to decode the target location associated with an eye movement from just the simultaneously recorded microphone signal. To do this, we again used the weights fit to the horizontal/vertical task data for the regression equation (Eq. 2 above):

Mic(t) = B_ΔH(t)ΔH + B_ΔV(t)ΔV + C(t).    [2]
[Figure 3 here: time courses of the five-origin grid task regression coefficients of Eq. 1, Mic(t) = B_H(t)H + B_ΔH(t)ΔH + B_V(t)V + B_ΔV(t)ΔV + C(t): horizontal change-in-eye-position B_ΔH(t) (A), horizontal initial eye position B_H(t) (B), vertical change-in-eye-position B_ΔV(t) (C), vertical initial eye position B_V(t) (D), and the constant C(t) (E), for left (blue) and right (red) ears (grand average, N = 10); y axes in SD/deg (A–D) or SD (E) against time re: saccade onset (ms); thickened line segments mark coefficients differing from 0 at p < 0.05; shading, ±SE; approximate saccade offset times indicated.]

Fig. 3. Regression analysis of EMREOs shows contributions from multiple aspects of eye movement: horizontal change-in-eye-position (A), horizontal initial eye position (B), and vertical change-in-eye-position (C). The contribution of vertical initial eye position was weaker (D). Finally, the constant component showed a contribution that was also consistent across saccades (E). The regression involved modeling the microphone signal at each time point, and each panel shows the time-varying values of the coefficients associated with the different aspects of the eye movement (horizontal vs. vertical, change-in-position and initial position). The regressions were fit to individual subjects' microphone recordings and plotted here as grand averages of these regression coefficients across the N = 10 subjects tested in the five-origin grid task. Microphone signals were z-scored in reference to baseline variability during a period −150 to −120 ms relative to saccade onset. Results are presented in units of SD (panel E) or SD per degree (panels A–D). Shaded areas represent ±SEM.
Specifically, we used the Mic(t) values observed in the single-origin grid task to solve this system of multivariate linear equations across the time window −5 to 70 ms with respect to saccade onset (a time period in which the EMREO appears particularly consistent and substantial in magnitude) to generate "read-out" values of ΔH and ΔV associated with each target's actual ΔH and ΔV. We conducted this analysis on the left ear and right ear data separately. The left ear results of this analysis are seen in each of the individual panels of Fig. 5; the black values (e.g., −18, 12) indicate the actual horizontal and vertical locations of the target, and the associated red values indicate the inferred location of the target. Across the top of the figure, the numbers indicate the average inferred horizontal location, and down the right side, the numbers indicate the average inferred vertical location. These results indicate that, on average, the targets can be read out in the proper order, but the spatial scale is compressed: The average read-out values for the ±18° horizontal targets are approximately ±11 to 12°, and the averages for the ±12° vertical targets are approximately ±6 to 7°. Similar patterns occurred for the right ear data (SI Appendix, Fig. S6).
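A minimal sketch of this read-out step (our illustration, consistent with the description above): with the constant term moved to the left-hand side, Eq. 2 over the −5 to 70 ms window is an overdetermined linear system in the two unknowns ΔH and ΔV, solvable by least squares.

```python
import numpy as np

def decode_saccade(mic, b_dh, b_dv, c, times_ms, t_lo=-5.0, t_hi=70.0):
    """Read out (dH, dV) in degrees from one observed microphone waveform.

    mic, b_dh, b_dv, c, times_ms : (n_times,) arrays; times in ms re: saccade onset.
    Solves mic(t) - C(t) = B_dH(t)*dH + B_dV(t)*dV in the least-squares sense
    over the analysis window."""
    w = (times_ms >= t_lo) & (times_ms <= t_hi)   # restrict to -5..70 ms
    A = np.column_stack([b_dh[w], b_dv[w]])       # columns multiply the unknowns
    y = mic[w] - c[w]                             # constant term moved to the left
    (dh_hat, dv_hat), *_ = np.linalg.lstsq(A, y, rcond=None)
    return dh_hat, dv_hat
```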
Plots of these target readouts in both horizontal and vertical dimensions for both ears are shown in Fig. 6 A–F. Fig. 6A shows the inferred location of the target (red dots) connected to the actual location of the target (black dots) using the data from Fig. 5, i.e., the left ear readout, and Fig. 6 B and C shows regressions of these target readouts as a function of the horizontal and vertical locations. Fig. 6 D–F shows the corresponding results for the right ears. Altogether, these figures illustrate that the readout accuracy is better in the horizontal than in the vertical dimension. Quantitatively, the r² values for the horizontal dimension were 0.89 (LE) and 0.91 (RE), and the corresponding values for the vertical dimension were 0.61 (LE) and 0.67 (RE). Slopes were also closer to the ideal value of 1 for the horizontal dimension (0.71, LE; 0.77, RE) than for the vertical dimension (0.51, LE; 0.51, RE).
Given that the brain is known to use binaural computations for reconstructing auditory space, we wondered whether the accuracy of this read-out could be improved by combining signals recorded in each ear simultaneously. We first considered a binaural difference computation, subtracting the right ear microphone recordings from the left, thus eliminating the part of the signal that is common to the two ears. Fig. 6G shows the results. Generally, the horizontal dimension is well ordered, whereas the vertical dimension continues to show considerable shuffling. This can also be seen in Fig. 6 H and I, which show the relationship between the inferred target location and the true target location, plotted on the horizontal and vertical dimensions, respectively. The correlation between inferred and actual target is higher in the horizontal dimension (r² = 0.95) than in the vertical dimension (r² = 0.41), which is actually worse than the monaural readouts. This makes sense because the binaural difference computation serves to diminish the contribution from aspects of the signal that are in phase across the two ears, such as the dependence on vertical change in eye position. We then reasoned that improvement in the vertical readout could be achieved by instead averaging, rather than subtracting, the signals across the two ears, and indeed this is so: averaging across the two ears produces an improved vertical readout (r² = 0.73, Fig. 6K). Finally, a hybrid readout operation, in which the horizontal location is computed from the binaural difference and the vertical location is computed from the binaural average, produces an additional modest improvement, yielding the best overall reconstruction of target location (Fig. 6J).
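A minimal sketch of the hybrid readout, building on decode_saccade() above (our illustration; it assumes coef_diff and coef_avg hold Eq. 2 coefficients fit to the binaural difference and binaural average signals, respectively):

```python
def hybrid_readout(mic_left, mic_right, coef_diff, coef_avg, times_ms):
    """Decode the horizontal component from the left-minus-right difference
    signal and the vertical component from the binaural average.

    coef_diff, coef_avg : (b_dh, b_dv, c) tuples fit to the corresponding
    combined signals (hypothetical names)."""
    diff = mic_left - mic_right         # keeps binaurally phase-inverting parts
    avg = 0.5 * (mic_left + mic_right)  # keeps binaurally in-phase parts
    dh, _ = decode_saccade(diff, *coef_diff, times_ms)
    _, dv = decode_saccade(avg, *coef_avg, times_ms)
    return dh, dv
```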
[Figure 4 here: grand average (N = 10) time courses of the horizontal change-in-position coefficient B_ΔH(t) (A, D), the vertical change-in-position coefficient B_ΔV(t) (B, E), and the constant C(t) (C, F) for the single-origin grid task vs. the horizontal/vertical task; left ear in A–C, right ear in D–F; y axes in Mic SD/deg (or Mic SD for C(t)) against time re: saccade onset (ms); shading, ±SE.]

Fig. 4. Different tasks generate similar regression coefficient curves. Grand average of the regression results for the single-origin grid (black lines) and horizontal/vertical (green lines) tasks. The horizontal change-in-position (A), the vertical change-in-position (B), and the constant component (C) are shown for the left ear. The lines and shading represent the average and SE of the coefficient values across the same 10 subjects for the two tasks. The same information is also shown for the right ear (D–F). See SI Appendix, Fig. S5 for corresponding findings among the individual subjects.

We next considered how well this readout operation performed at the level of individual subjects. Results for each subject are shown in SI Appendix, Fig. S7 A–J, and a population summary is shown in SI Appendix, Fig. S7K. The relationship between the inferred location and the actual location was statistically significant (P < 0.05) for all 10 subjects in the horizontal dimension, and for 7 of 10 subjects in the vertical dimension. This confirms that we
can predict the horizontal location of the target of a saccade from
ear recordings in each individual subject, and the vertical location
can be predicted for most, but not all, subjects.
Finally, we evaluated the error when reading out the target location of individual trials. The preceding analyses show the results when the readout operation is performed on the average waveform observed across trials for a given target location. The same readout can also be computed for each individual trial. When conducted on all individual trials, the resulting scatter can be computed as the average SD observed across target locations and subjects. We found that the average scatter, or SD, was 19.1° in the horizontal dimension and 23.8° in the vertical dimension. For the horizontal dimension, this corresponds to roughly half of the range of space tested (±18°), whereas in the vertical dimension, it corresponds to nearly the whole range (±12°).
Overall, these results parallel human sound localization, which relies on a binaural difference computation in the horizontal dimension (and is more accurate in that dimension) vs. potentially monaural or averaged spectral cues in the vertical dimension (which is less accurate) (41, 42). Indeed, horizontal and vertical sound localization show different patterns of dependence on the loudness of the target sound relative to background noise, further supporting the view that these operations are accomplished via distinct mechanisms (43).
Discussion

Sound locations are inferred from head-centered differences in sound arrival time, intensity, and spectral content, but visual stimulus locations are inferred from eye-centered retinal locations (41, 42). Information about eye movements with respect to the head/ears is critical for connecting the visual and auditory scenes to one another (1). This insight has motivated a number of previous neurophysiological studies in various brain areas in monkeys and cats, all of which showed that changes in eye position affected the auditory response properties of at least some neurons in the brain area studied [inferior colliculus: (8–12); auditory cortex: (5–7); superior colliculus: (18–24); frontal eye fields: (13, 44); intraparietal cortex: (14–17)].
[Figure 5 here: grand average observed waveforms (black, ±SE) and predicted waveforms (red) for each grid-task target, arranged by target location (N = 10, left ears); each panel plots the microphone signal (SD) against time re: saccade onset (ms) and lists the actual (black) and inferred (red) target coordinates; average inferred horizontal and vertical target positions are given along the figure margins.]

Fig. 5. Regression coefficients fit to microphone recordings from the horizontal/vertical-saccade task can be used to predict the waveforms observed in the grid task and their corresponding target locations. Combined results for all N = 10 participants' left ears. The black traces indicate the grand average of all the individual participants' mean microphone signals during the single-origin grid task, with the shading indicating ±SE across participants. The red traces show an estimate of the EMREO at that target location based only on regression coefficients measured from the horizontal/vertical task. Black values in parentheses are the actual horizontal and vertical coordinates for each target in the grid task. Corresponding red values indicate the inferred target location, based on solving a multivariate regression that fits the observed grid-task microphone signals in a time window (−5 to 70 ms with respect to saccade onset) to the observed regression weights from the horizontal/vertical task for each target location. The averages of these values in the horizontal and vertical dimensions are shown across the top and right sides. See Fig. 6 for additional plots of the inferred vs. actual target values and SI Appendix, Fig. S6 for corresponding right-ear data.

These findings raised the question of where signals related to eye movements first appear in the auditory processing stream. The discovery of EMREOs (25–28, 45) introduced the intriguing possibility that the computational process leading to visual-auditory
integration might be manifested in the most peripheral part of the auditory system. Here, we show that the signals present in the ear exhibit the properties necessary for playing a role in this process: These signals carry information about the horizontal and vertical components of eye movements and display signatures related to both the change in eye position and the absolute position of the eyes in the orbits. Because of the parametric information present in the EMREO signal, we were able to predict EMREOs in one task from the EMREOs recorded in another, and even to predict the targets of eye movements from the simultaneous EMREO recordings. These predictions were accomplished using strictly linear methods, a conservative approach providing a lower bound on what can be deduced from these signals. Improvements in the "readout" may come from exploration of more powerful nonlinear techniques and/or other refinements, such as tailoring the time window used for the readout (here, a generous −5 to 70 ms) or stricter criteria for the exclusion of trials contaminated by noise (see "Methods: Trial exclusion criteria"). It should be noted that this read-out presumes knowledge of when the saccade starts and that performance would be substantially poorer if conducted in a continuous fashion across time.
Our present observations raise two key questions: What causes EMREOs, and how do those mechanisms impact hearing/auditory processing? The proximate cause of EMREOs is likely to be one or more of the known types of motor elements in the ear*: the middle ear muscles (stapedius and tensor tympani), which modulate the motion of the ossicles (46–48), and the outer hair cells, which modulate the motion of the basilar membrane (49). One or more of these elements may be driven by descending brain signals originating from within the oculomotor pathway and entering the auditory pathway somewhere along the descending stream that ultimately reaches the ear via the 5th (tensor tympani), 7th (stapedius muscle), and/or 8th (outer hair cells) nerves (see refs. 50–55 for reviews). Efforts are currently underway in our laboratory to identify the specific EMREO generators/modulators (56–58).

*We note that EMREOs are unlikely to be due to the actual sound of the eyes moving in the orbits. Our original study, Gruters et al. (25), showed that when microphone recordings are aligned on saccade offset (as opposed to onset, as we did here), EMREOs continue for at least several tens of milliseconds after the eyes have stopped moving.
Uncovering the underlying mechanism should shed light on another question: Does the temporal pattern of the observed EMREO signal reflect the time course and nature of that underlying mechanism's impact on auditory processing? It is not clear how an oscillatory signal like the one observed here might contribute to hearing. However, it is also not clear that the underlying mechanism is, in fact, oscillatory. Microphones can only detect signals with oscillatory energy in the range of sensitivity of the microphone. It is possible that the observed oscillations reflect ringing associated with a change in some mechanical property of the transduction system and that the change itself could have a nonoscillatory temporal profile (Fig. 7A). Of particular interest would be a ramp-to-step profile, in which aspects of the middle or inner ear shift from one state to another during the course of a saccade and hold steady at the new state during the subsequent fixation period. This kind of temporal profile would match the time course of the saccade itself.
[Figure 6 here: 2D maps of read-out (red) vs. actual (black) target locations and the corresponding horizontal and vertical regressions, for the left ear (A–C: horizontal r² = 0.89, slope = 0.71; vertical r² = 0.61, slope = 0.51), the right ear (D–F: horizontal r² = 0.91, slope = 0.77; vertical r² = 0.65, slope = 0.51), the binaural difference (G–I: horizontal r² = 0.95, slope = 0.73; vertical r² = 0.41, slope = 0.43), the hybrid model (J), and the binaural average vertical readout (K: r² = 0.73, slope = 1.02); axes in degrees; all regressions p ≤ 0.001.]

Fig. 6. Multiple ways of reading out target location from the ear canal recordings. As in Fig. 5 and SI Appendix, Fig. S6, the relationship between EMREOs and eye movements was quantitatively modeled using Eq. 2 and the ear canal data recorded in the horizontal/vertical task. Inferred grid-task target location was read out by solving Eq. 2 for ΔH and ΔV using the coefficients as fit from the horizontal/vertical task and the microphone values as observed in the single-origin grid task; see main text for details. (A) Inferred target location (red) compared to actual target location (black), based on the left ear (same data as in Fig. 5). (B) Horizontal component of the read-out target vs. the actual horizontal component (left ear microphone signals). (C) Same as (B) but for the vertical component. (D–F) Same as (A–C) but for the right ear. (G–I) Same as (A–C) and (D–F) but computed using the binaural difference between the microphone signals (left ear minus right ear). (J and K) A hybrid read-out model (J) using the binaural difference in the horizontal dimension (H) and the binaural average in the vertical dimension (K). Related findings at the individual-subject level are provided in SI Appendix, Fig. S7.

Available eye movement control signals in the oculomotor system include those that follow this ramp-and-hold temporal profile, or tonic activity that is proportional to eye position throughout periods of both movement and fixation. In addition to such tonic signals, oculomotor areas also contain neurons that exhibit burst patterns, i.e., elevated discharge in association with the saccade itself, as well as combinations of burst and tonic patterns (for reviews, see refs. 59 and 60). It remains to be seen which of these signals or signal combinations might be sent to the auditory periphery and where they might come from. The paramedian pontine reticular formation is a strong candidate for a source, having been implicated in providing corollary discharge signals of eye movements in visual experiments (61) (see also ref. 62) and containing each of these basic temporal signal profiles (59, 60). Regardless of the source and nature of the descending corollary discharge signal, the oscillations observed here should be thought of as possibly constituting a biomarker for an underlying, currently unknown, mechanism, rather than necessarily the effect itself.
[Figure 7 here: (A) schematic temporal profiles: eye position during a saccade (ramp-and-hold), the observed EMREO (oscillatory microphone signal), and candidate corollary discharge signals (burst, tonic, and burst-tonic neural activity); (B) schematic of the working conceptual model: an eye movement command shifts the retinal locations of visual stimuli and, via a copy sent through an unknown mechanism to the ear, modulates sound localization cues, yielding a synthesized visual-auditory object.]

Fig. 7. Temporal profiles of relevant signals and working conceptual model for how EMREOs might relate to our ability to link visual and auditory stimuli in space. (A) Temporal profiles of signals. The EMREO is oscillatory, whereas the eye movement to which it is synchronized involves a ramp-and-hold temporal profile. Candidate source neural signals in the brain might exhibit a ramp-and-hold (tonic) pattern, suggesting a ramp-and-hold-like underlying effect on an as-yet-unknown peripheral mechanism, or could derive from other known temporal profiles, including bursts of activity time-locked to saccades. (B) Working conceptual model. The brain causes the eyes to move by sending a command to the eye muscles. Each eye movement shifts the location of visual stimuli on the retinal surface. A copy, possibly a highly transformed one, of this eye movement command is sent to the ear, altering ear mechanics in some unknown way. When a sound occurs, the ascending signal to the brain will depend on the combination of its location in head-centered space (based on the physical values of binaural timing and level differences and spectral cues) and aspects of recent eye movements and fixation position. This hybrid signal could then be read out by the brain.

Despite these critical unknowns, it is useful to articulate a working conceptual model of how EMREOs might facilitate visual and auditory integration (Fig. 7B). The general notion is that, by sending a copy of each eye movement command to the motor elements of the auditory periphery, the brain keeps the ear informed about the current orientation of the eyes. If, as noted above, these descending oculomotor signals cause a ramp-to-step change in the state of tension of components within the EMREO pathway, time-locked to the eye movement and lasting for the duration of each fixation period, they would effectively change the transduction mechanism in an eye position/eye movement-dependent
fashion. In turn, these changes could affect the latency, gain, or frequency-filtering properties of the response to sound. Indeed, intriguing findings from Puria et al. (35) have recently indicated that the tension applied by the middle ear muscles likely affects all three of these aspects of sound transmission throughout the middle ear. In short, the signal sent to the brain in response to an incoming sound could ultimately reflect a mixture of the physical cues related to the location of the sound itself (the interaural timing differences, interaural level differences, and spectral cues) and eye position/movement information.
Most neurophysiological studies report signals consistent with a hybrid code in which information about sound location is blended in a complex fashion with information about eye position and movement, both within and across neurons (6, 10, 11, 18, 19, 21, 44). Computational modeling confirms that, in principle, these complex signals can be read out to produce a signal of sound location with respect to the eyes (10). However, substantive differences do remain between the observations here and such neural studies, chiefly in that the neural investigations have focused primarily on periods of steady fixation. A more complete characterization of neural signals time-locked to saccades is therefore needed (8, 63).
Note that this working model differs from a spatial attention mechanism in which the brain might direct the ears to "listen" selectively to a particular location in space. Rather, under our working model, the response to sounds from any location would be impacted by peripheral eye movement/position dependence in a consistent fashion across all sound locations. However, such a system could well work in concert with top-down attention, which has previously been shown to impact outer hair cells even when participants are required to fixate and not make eye movements (64–70).
Another question concerns whether EMREOs might actually impair sound localization, specifically for brief sounds presented during an eye movement. We think the answer to this is no. Boucher et al. (2) reported that perisaccadic sound localization is quite accurate, which suggests that EMREOs (or their underlying mechanism) do not impair perception. This is an important insight because, given the rate at which eye movements occur (about 3/s) and with each associated EMREO signal lasting 100 ms or longer [due to extending past the end of saccades, as explored by Gruters et al. (25) and King et al. (28)], it would be highly problematic if sounds could not be accurately detected or localized when they occur in conjunction with saccades. If there is indeed a ramp-to-step system underlying the observed oscillations, then transduction of all sounds will be affected, regardless of when they occur with respect to saccades. Indeed, recent work supports the view that sound detection is unaffected by saccades (27).
All this being said, a role for EMREOs in computing the spatial location of sounds with respect to the visual scene does not preclude other roles. Specifically, they could also play a role in synchronizing sampling in the temporal domain (e.g., refs. 29–33). Such a possibility could account for the significant constant term C(t) of the regression analysis (Eq. 1 and Fig. 3). This temporally precise nonspatial component could play a role in resetting or refreshing auditory processing across time or in coordinating with the refreshing of the visual image on the retina.
Overall, how brain-controlled mechanisms adjust the signaling properties of peripheral sensory structures is critical for understanding sensory processing as a whole. Auditory signals are known to adjust the sensitivity of the visual system via sound-triggered pupil dilation (71), indicating that communication between these two senses is likely to be a two-way street. The functional impact of such communication at low-level stages is yet to be fully explored and may have implications for how individuals compensate when the information from one sensory system is inadequate, whether due to natural situations such as noisy sound environments or occluded visual ones, or due to physiological impairments in one or more sensory systems.
Data, Materials, and Software Availability. Anonymized microphone recordings, eye movements, and associated parameter data have been deposited in the Figshare+ repository (72).
ACKNOWLEDGMENTS. We are grateful to Dr. Matthew Cooper, Dr. Kurtis Gruters, Jesse Herche, Dr. David Kaylie, Dr. Jeff Mohl, Dr. Shawn Willett, Meredith Schmehl, Dr. Jonathan Siegel, Chadbourne Smith, Dr. David Smith, Justine Shih, Chloe Weiser, and Tingan Zhu for discussions and other assistance concerning this project. This work was supported by NIH (National Institute on Deafness and Other Communication Disorders) grant DC017532 to J.M.G.
Author affiliations: aDepartment of Psychology and Neuroscience, Duke University, Durham, NC 27708; bDepartment of Neurobiology, Duke University, Durham, NC 27710; cCenter for Cognitive Neuroscience, Duke University, Durham, NC 27708; dDuke Institute for Brain Sciences, Duke University, Durham, NC 27708; eDepartment of Otolaryngology, University of Southern California, Los Angeles, CA 90007; fDepartment of Computer Science, Duke University, Durham, NC 27708; and gDepartment of Biomedical Engineering, Duke University, Durham, NC 27708.
1. J. M. Groh, D. L. Sparks, Two models for transforming auditory signals from head-centered to eye-centered coordinates. Biol. Cybernetics 67, 291–302 (1992).
2. L. Boucher, J. M. Groh, H. C. Hughes, Afferent delays and the mislocalization of perisaccadic stimuli. Vis. Res. 41, 2631–2644 (2001).
3. R. R. Metzger, O. A. Mullette-Gillman, A. M. Underhill, Y. E. Cohen, J. M. Groh, Auditory saccades from different eye positions in the monkey: Implications for coordinate transformations. J. Neurophysiol. 92, 2622–2627 (2004).
4. V. C. Caruso, D. S. Pages, M. A. Sommer, J. M. Groh, Compensating for a shifting world: Evolving reference frames of visual and auditory signals across three multimodal brain areas. J. Neurophysiol. 126, 82–94 (2021).
5. K. M. Fu et al., Timing and laminar profile of eye-position effects on auditory responses in primate auditory cortex. J. Neurophysiol. 92, 3522–3531 (2004).
6. J. X. Maier, J. M. Groh, Comparison of gain-like properties of eye position signals in inferior colliculus versus auditory cortex of primates. Front. Integr. Neurosci. 4, 121–132 (2010).
7. U. Werner-Reiss, K. A. Kelly, A. S. Trause, A. M. Underhill, J. M. Groh, Eye position affects activity in primary auditory cortex of primates. Curr. Biol. 13, 554–562 (2003).
8. D. A. Bulkin, J. M. Groh, Distribution of eye position information in the monkey inferior colliculus. J. Neurophysiol. 107, 785–795 (2012).
9. D. A. Bulkin, J. M. Groh, Distribution of visual and saccade related information in the monkey inferior colliculus. Front. Neural Circuits 6, 61 (2012).
10. J. M. Groh, A. S. Trause, A. M. Underhill, K. R. Clark, S. Inati, Eye position influences auditory responses in primate inferior colliculus. Neuron 29, 509–518 (2001).
11. K. K. Porter, R. R. Metzger, J. M. Groh, Representation of eye position in primate inferior colliculus. J. Neurophysiol. 95, 1826–1842 (2006).
12. M. P. Zwiers, H. Versnel, A. J. Van Opstal, Involvement of monkey inferior colliculus in spatial hearing. J. Neurosci. 24, 4145–4156 (2004).
13. G. S. Russo, C. J. Bruce, Frontal eye field activity preceding aurally guided saccades. J. Neurophysiol. 71, 1250–1253 (1994).
14. O. A. Mullette-Gillman, Y. E. Cohen, J. M. Groh, Motor-related signals in the intraparietal cortex encode locations in a hybrid, rather than eye-centered, reference frame. Cereb. Cortex 19, 1761–1775 (2009).
15. O. A. Mullette-Gillman, Y. E. Cohen, J. M. Groh, Eye-centered, head-centered, and complex coding of visual and auditory targets in the intraparietal sulcus. J. Neurophysiol. 94, 2331–2352 (2005).
16. Y. E. Cohen, R. A. Andersen, Reaches to sounds encoded in an eye-centered reference frame. Neuron 27, 647–652 (2000).
17. B. Stricanne, R. A. Andersen, P. Mazzoni, Eye-centered, head-centered, and intermediate coding of remembered sound locations in area LIP. J. Neurophysiol. 76, 2071–2076 (1996).
18. J. Lee, J. M. Groh, Auditory signals evolve from hybrid- to eye-centered coordinates in the primate superior colliculus. J. Neurophysiol. 108, 227–242 (2012).
19. M. F. Jay, D. L. Sparks, Auditory receptive fields in primate superior colliculus shift with changes in eye position. Nature 309, 345–347 (1984).
20. M. F. Jay, D. L. Sparks, Sensorimotor integration in the primate superior colliculus. I. Motor convergence. J. Neurophysiol. 57, 22–34 (1987).
21. M. F. Jay, D. L. Sparks, Sensorimotor integration in the primate superior colliculus. II. Coordinates of auditory signals. J. Neurophysiol. 57, 35–55 (1987).
22. J. C. Zella, J. F. Brugge, J. W. Schnupp, Passive eye displacement alters auditory spatial receptive fields of cat superior colliculus neurons. Nat. Neurosci. 4, 1167–1169 (2001).
23. L. C. Populin, D. J. Tollin, T. C. Yin, Effect of eye position on saccades and neuronal responses to acoustic stimuli in the superior colliculus of the behaving cat. J. Neurophysiol. 92, 2151–2167 (2004).
24. P. H. Hartline, R. L. Vimal, A. J. King, D. D. Kurylo, D. P. Northmore, Effects of eye position on auditory localization and neural representation of space in superior colliculus of cats. Exp. Brain Res. 104, 402–408 (1995).
25. K. G. Gruters et al., The eardrums move when the eyes move: A multisensory effect on the mechanics of hearing. Proc. Natl. Acad. Sci. U.S.A. 115, E1309–E1318 (2018).
26. S. N. Lovich et al., Conserved features of eye movement related eardrum oscillations (EMREOs) across humans and monkeys. Philos. Trans. R. Soc. Lond. B Biol. Sci. 378, 20220340 (2023).
27. F. Bröhl, C. Kayser, Detection of spatially-localized sounds is robust to saccades and concurrent eye movement-related eardrum oscillations (EMREOs). J. Neurosci., in press.
28. C. D. King et al., Individual similarities and differences in eye-movement-related eardrum oscillations (EMREOs). Hear. Res., in press.
29. M. N. O'Connell et al., The role of motor and environmental visual rhythms in structuring auditory cortical excitability. iScience 23, 101374 (2020).
30. A. Barczak et al., Dynamic modulation of cortical excitability during visual active sensing. Cell Rep., 3447–3459.e3 (2019).
31. M. H. A. Köhler, N. Weisz, Cochlear theta activity oscillates in phase opposition during interaural attention. J. Cogn. Neurosci. 35, 588–602 (2023).
32. M. Leszczynski et al., Saccadic modulation of neural excitability in auditory areas of the neocortex. Curr. Biol. 33, 1185–1195.e6 (2023).
33. M. H. A. Köhler, G. Demarchi, N. Weisz, Cochlear activity in silent cue-target intervals shows a theta-rhythmic pattern and is correlated to attentional alpha and theta modulations. BMC Biol. 19, 48 (2021).
34. L. Gallagher, M. Diop, E. S. Olson, Time-domain and frequency-domain effects of tensor tympani contraction on middle ear sound transmission in gerbil. Hear Res. 405, 108231 (2021).
35. N. H. Cho, M. E. Ravicz, S. Puria, Human middle-ear muscle pulls change tympanic-membrane shape and low-frequency middle-ear transmission magnitudes and delays. Hear Res. 430, 108721 (2023).
36. M. M. Monti, Statistical analysis of fMRI time-series: A critical review of the GLM approach. Front. Hum. Neurosci. 5, 28 (2011).
37. J. C. Middlebrooks, D. M. Green, Sound localization by human listeners. Annu. Rev. Psychol. 42, 135–159 (1991).
38. J. Hebrank, D. Wright, Are two ears necessary for localization of sound sources on the median plane? J. Acoust. Soc. Am. 56, 935–938 (1974).
39. J. Hebrank, D. Wright, Spectral cues used in the localization of sound sources on the median plane. J. Acoust. Soc. Am. 56, 1829–1834 (1974).
40. E. A. Macpherson, A. T. Sabin, Vertical-plane sound localization with distorted spectral cues. Hear Res. 306, 76–92 (2013).
41. J. Blauert, Spatial Hearing (MIT Press, Cambridge, MA, 1997).
42. J. M. Groh, Making Space: How the Brain Knows Where Things Are (Harvard University Press, Cambridge, MA, 2014).
43. R. Ege, A. J. V. Opstal, M. M. Van Wanrooij, Accuracy-precision trade-off in human sound localisation. Sci. Rep. 8, 16399 (2018).
44. V. C. Caruso, D. S. Pages, M. A. Sommer, J. M. Groh, Compensating for a shifting world: A quantitative comparison of the reference frame of visual and auditory signals across three multimodal brain areas. J. Neurophysiol. 126, 82–94 (2021).
45. H. Abbasi et al., Audiovisual temporal recalibration modulates eye movement-related eardrum oscillations. International Multisensory Research Forum, Brussels, Belgium, 27–30 June 2023. Abstract 37. https://imrf2023.sciencesconf.org/data/pages/IMRF23_FullProgram.pdf.
46. I. J. Hung, P. Dallos, Study of the acoustic reflex in human beings. I. Dynamic characteristics. J. Acoust. Soc. Am. 52, 1168–1180 (1972).
47. E. S. Mendelson, A sensitive method for registration of human intratympanic muscle reflexes. J. Appl. Physiol. 11, 499–502 (1957).
48. S. A. Gelfand, "The contralateral acoustic reflex" in The Acoustic Reflex: Basic Principles and Clinical Applications, S. Silman, D. D. Dirks, Eds. (Academic Press, New York, NY, 1984), pp. 137–186.
49. W. E. Brownell, C. R. Bader, D. Bertrand, Y. de Ribaupierre, Evoked mechanical responses of isolated cochlear outer hair cells. Science 227, 194–196 (1985).
50. J. J. Guinan Jr., "Cochlear mechanics, otoacoustic emissions, and medial olivocochlear efferents: Twenty years of advances and controversies along with areas ripe for new work" in Perspectives on Auditory Research (Springer, New York, 2014), pp. 229–246.
51. J. J. Guinan Jr., Olivocochlear efferents: Anatomy, physiology, function, and the measurement of efferent effects in humans. Ear Hear. 27, 589–607 (2006).
52. S. Mukerji, A. M. Windsor, D. J. Lee, Auditory brainstem circuits that mediate the middle ear muscle reflex. Trends Amplif. 14, 170–191 (2010).
53. N. P. Cooper, J. J. Guinan Jr., Efferent-mediated control of basilar membrane motion. J. Physiol. 576, 49–54 (2006).
54. M. C. Liberman, J. J. Guinan Jr., Feedback control of the auditory periphery: Anti-masking effects of middle ear muscles vs. olivocochlear efferents. J. Commun. Disord. 31, 471–482 (1998).
55. R. Galambos, Suppression of auditory nerve activity by stimulation of efferent fibers to cochlea. J. Neurophysiol. 19, 424–437 (1956).
56. S. N. Schlebusch et al., Changes in saccade-related eardrum oscillations after surgical denervation of the stapedius muscle. Soc. Neurosci. Abstr. 578 (2019).
57. S. N. Schlebusch et al., Changes in saccade-related eardrum oscillations after surgical denervation of the stapedius muscle. Assoc. Res. Otolaryngol. Abstr. PD89 (2020).
58. C. King, S. Lovich, D. Kaylie, C. Shera, J. Groh, Measuring the impact of auditory system impairments on eye-movement-related eardrum oscillations (EMREOs). Assoc. Res. Otolaryngol. Abstr., Orlando, FL, February 2023, TU177 (2023).
59. M. Takahashi, Y. Shinoda, Brain stem neural circuits of horizontal and vertical saccade systems and their frame of reference. Neuroscience 392, 281–328 (2018).
60. A. F. Fuchs, C. R. Kaneko, C. A. Scudder, Brainstem control of saccadic eye movements. Annu. Rev. Neurosci. 8, 307–337 (1985).
61. D. L. Sparks, L. E. Mays, J. D. Porter, Eye movements induced by pontine stimulation: Interaction with visually triggered saccades. J. Neurophysiol. 58, 300–318 (1987).
62. B. L. Guthrie, J. D. Porter, D. L. Sparks, Corollary discharge provides accurate eye position information to the oculomotor system. Science 221, 1193–1195 (1983).
63. K. K. Porter, R. R. Metzger, J. M. Groh, Visual- and saccade-related signals in the primate inferior colliculus. Proc. Natl. Acad. Sci. U.S.A. 104, 17855–17860 (2007).
64. P. H. Delano, D. Elgueda, C. M. Hamame, L. Robles, Selective attention to visual stimuli reduces cochlear sensitivity in chinchillas. J. Neurosci. 27, 4146–4153 (2007).
65. A. W. Harkrider, C. D. Bowers, Evidence for a cortically mediated release from inhibition in the human cochlea. J. Am. Acad. Audiol. 20, 208–215 (2009).
66. S. Srinivasan, A. Keil, K. Stratis, K. L. Woodruff Carr, D. W. Smith, Effects of cross-modal selective attention on the sensory periphery: Cochlear sensitivity is altered by selective attention. Neuroscience 223, 325–332 (2012).
67. S. Srinivasan et al., Interaural attention modulates outer hair cell function. Eur. J. Neurosci. 40, 3785–3792 (2014).
68. K. P. Walsh, E. G. Pasanen, D. McFadden, Selective attention reduces physiological noise in the external ear canals of humans. I: Auditory attention. Hear Res. 312, 143–159 (2014).
69. K. P. Walsh, E. G. Pasanen, D. McFadden, Changes in otoacoustic emissions during selective auditory and visual attention. J. Acoust. Soc. Am. 137, 2737–2757 (2015).
70. A. Wittekindt, J. Kaiser, C. Abel, Attentional modulation of the inner ear: A combined otoacoustic emission and EEG study. J. Neurosci. 34, 9995–10002 (2014).
71. A. D. Bala, T. T. Takahashi, Pupillary dilation response as an indicator of auditory discrimination in the barn owl. J. Comp. Physiol. A 186, 425–434 (2000).
72. J. M. Groh, D. L. K. Murphy, S. N. Lovich, C. D. King, Eye movement-related eardrum oscillations (EMREOs) dataset and supporting code. Figshare+. https://doi.org/10.25452/figshare.plus.24470548. Deposited 1 November 2023.