The impact of early reflections on binaural cues
Boris Gourévitch a) and Romain Brette b)
Equipe Audition, Département d'Etudes Cognitives, Ecole Normale Supérieure, 29, rue d'Ulm,
75005 Paris, France
(Received 14 June 2011; revised 16 May 2012; accepted 16 May 2012)
Animals live in cluttered auditory environments, where sounds arrive at the two ears through
several paths. Reflections make sound localization difficult, and it is thought that the auditory
system deals with this issue by isolating the first wavefront and suppressing later signals. However,
in many situations, reflections arrive too early to be suppressed, for example, reflections from the
ground in small animals. This paper examines the implications of these early reflections on binaural
cues to sound localization, using realistic models of reflecting surfaces and a spherical model of
diffraction by the head. The fusion of direct and reflected signals at each ear results in interference
patterns in binaural cues as a function of frequency. These cues are maximally modified at
frequencies related to the delay between direct and reflected signals, and therefore to the spatial
location of the sound source. Thus, natural binaural cues differ from anechoic cues. In particular,
the range of interaural time differences is substantially larger than in anechoic environments.
Reflections may potentially contribute binaural cues to distance and polar angle when the properties
of the reflecting surface are known and stable, for example, for reflections on the ground.
© 2012 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4726052]
PACS number(s): 43.20.El, 43.66.Pn, 43.66.Qp [MAA] Pages: 9–27
I. INTRODUCTION
To localize sound sources, many species, including
humans, rely on subtle differences in the signals arriving at
the two ears. The ear closer to the source receives the sound
earlier and with a higher level than the other ear. These inter-
aural time differences (ITDs) and interaural level differences
(ILDs) are produced by sound propagation and diffraction of
sounds by the head, pinnae, and body. They vary systemati-
cally with the location of the sound source. The relationship
between these binaural cues and sound location has been
described in many species, mainly using acoustical measure-
ments of head-related transfer functions (HRTFs) in anechoic
chambers, in order to minimize the disturbances due to reflec-
tions. However, the acoustical environments animals live in
contain many objects that produce reflections, such as trees,
natural or artificial walls, and the ground. Even in the open
air, with no obstacle, at least one reflection is produced by
the ground, and its texture can be very variable, e.g. grass,
snow, soil, or asphalt. In principle, these reflections can affect
binaural cues, as pointed out by McFadden (1973).
Nevertheless, humans can maintain good localization
and segregation abilities in echoic environments (Blauert,
1997; Freyman et al., 2001; Litovsky et al., 1999; Zurek,
1987). This robustness to reflections is thought to be medi-
ated by the precedence effect (Litovsky et al., 1999). When a
sound and a reflection are separated by less than 1 ms, a
fused sound is perceived, with an intermediate localization
(summing localization). When the delay to the reflection is
between about 1 and 5 ms, the perceived source location is
dominated by the location of the leading sound; this property is called the “law of the first wavefront” (Blauert, 1997;
Shinn-Cunningham et al., 1993; Wallach et al., 1949). When
the delay is longer than about 5–10 ms, the two sounds
become separately audible (breakdown of fusion) and their
two distinct localizations are perceived. Note that the delay at which fusion breaks down is longer (50 ms) for speech or music than
for transient sounds (Litovsky et al., 1999; Lochner and
Burger, 1964).
Similar findings have been reported in a number of spe-
cies, with delays in the same range: cats (Cranford, 1982;
Populin and Yin, 1998), rats (Kelly, 1974), crickets (Wytten-
bach and Hoy, 1993), owls (Keller and Takahashi, 1996a;
Spitzer and Takahashi, 2006), and birds (Dent and Dooling,
2004). For example, in cats, localization performance
degrades for delays below 0.5 ms, which is consistent with
summing localization (Cranford, 1982; Populin and Yin,
1998; Tollin and Yin, 2003). Neural correlates of the precedence
effect have also been seen in recordings in the inferior
colliculus and auditory cortex of cats: for example, with
clicks separated by more than 2 ms, neural responses to the
lagging click are suppressed (Dent et al., 2009; Mickey and
Middlebrooks, 2001; Yin, 1994).
Thus, many reflections are either suppressed or sepa-
rately processed by the auditory system. However, not all
reflections can be suppressed by the auditory system. Con-
sider the situation illustrated in Fig. 1. The animal faces a
sound source, with its ears at a distance p from the ground.
Two signals arrive at the animal: the direct sound and its
reflection at the ground. Since the shortest path between two
points is a straight line, the path length of the reflection can
be no more than the distance from the sound source plus 2p
(see Fig. 1). Thus, the time delay of the reflection is always shorter than $2p/c$, where $c \approx 343$ m/s is the speed of sound in air at 20 °C. For example, for a guinea pig, which is about 8 cm tall, this delay is always shorter than 250 μs for all sounds produced on the horizontal plane. This is well within the range when the two sounds are perceptually fused. Therefore, the binaural cues that are available to the animal should be heavily affected by reflections. This hypothesis is supported by real recordings of ILDs in gerbils after reflection by a plywood floor, which show interference patterns (Maki et al., 2003).
a) Current address: Centre de Neurosciences Paris-Sud (CNPS), UMR CNRS 8195, Université Paris-Sud, Bâtiment 446, rue Claude Bernard, 91405 Orsay Cedex, France.
b) Author to whom correspondence should be addressed. Also at: Laboratoire Psychologie de la Perception, CNRS, Université Paris Descartes, 45, rue des Saints-Pères 75006 Paris, France. Electronic mail: romain.brette@ens.fr
J. Acoust. Soc. Am. 132 (1), July 2012 0001-4966/2012/132(1)/9/19/$30.00 © 2012 Acoustical Society of America
In this paper, we first examine early reflections in simple
geometrical models to understand how likely they are to pro-
duce reflected waves within the “fusion” range for several
species (Sec. II). We then examine the impact of these
reflections on binaural cues, first when diffraction is absent
(Sec. III A), and then using a spherical head model with real-
istic models of natural textures (Sec. III B). We find that
ILDs are more modified than ITDs, and that the variation in
ITD can be systematically related to the distance and polar
angle from the source (Sec. IV), potentially providing a
localization cue. We also notice that variations of ITDs and
ILDs due to reflections substantially extend the range of bin-
aural cues that the auditory system has to manage (Sec. V).
Finally, we discuss the implications of these results for
neural coding of sound location, binaural localization cues,
and psychophysical experiments.
II. EARLY REFLECTIONS AND THE PRECEDENCE
EFFECT
We consider two simple situations describing the proxim-
ity of a human or an animal to the ground or to a wall (Fig. 2).
Let S be a sound source at distance d from the head. In the
spherical coordinate system typically used in localization
studies, S has a lateral angle (azimuth) $\theta_S$ and a polar angle
(elevation, latitude) $\varphi_S$ relative to the center of the head (see
Appendix A for all symbols). When the sound wave propagating
from S encounters an obstacle such as the ground [Fig.
2(A)] or a wall [Fig. 2(B)], it is partly reflected and partly
absorbed. The incidence angle of the reflected sound wave is
that of a source S* which “mirrors” S relative to the obstacle.
As a consequence, for a reflection at the ground, the reflected
and direct sound waves have the same lateral angle; for a
reflection at a vertical wall, they have the same polar angle.
In the case of a reflection at the ground, the path length d* of the reflection is, at most, the distance of the sound source d plus 2p (see Fig. 1). Therefore the delay of the reflected sound is no more than $2p/c$. This upper bound corresponds to the case when the source is directly above the head (i.e., polar angle = 90°), but in many natural situations, this delay is shorter, in particular if the source is far from the ears or close to the ground. This is illustrated in Fig. 3, which shows the computed delay between direct and reflected sounds as a function of the distance from the source to the head and the polar or lateral angle of the sound source relative to the head (see Appendix B for the calculations). For the situation of a zero polar angle, when the source is farther than 10 m, this delay is less than 0.15 ms for animals smaller than cats [Fig. 3(B)], and less than 1 ms for humans [Fig. 3(A)]. The same reasoning applies to a reflection from a wall, where the distance to the ground is replaced by the distance to the wall.
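The bound and the grazing-incidence behavior described above can be checked numerically. The following is a minimal sketch, not the calculation of Appendix B: it assumes a flat ground, a point receiver at ear height p, and a mirror source, so that the reflected path length is $d^* = \sqrt{d^2 + 4p^2 + 4pd\sin\varphi_S}$. The function name and its parameters are hypothetical.

```python
import math

C_AIR = 343.0  # speed of sound in air at 20 °C (m/s), as quoted in the text


def ground_reflection_delay(d, p, polar_deg=0.0, c=C_AIR):
    """Delay (s) of the ground reflection relative to the direct sound.

    d: source-to-ear distance (m); p: ear height above the ground (m);
    polar_deg: elevation of the source as seen from the ears (degrees).
    Assumption: flat ground and mirror-source geometry (illustrative only).
    """
    phi = math.radians(polar_deg)
    if p + d * math.sin(phi) < 0.0:
        raise ValueError("source would be below the ground")
    # Distance from the ears to the mirrored source S*.
    d_star = math.sqrt(d * d + 4.0 * p * p + 4.0 * p * d * math.sin(phi))
    return (d_star - d) / c


# Source at 1.5 m and at ear height (polar angle 0) for three ear heights.
for name, p in [("human", 1.70), ("gerbil", 0.12), ("mouse", 0.02)]:
    print(f"{name}: {ground_reflection_delay(1.5, p) * 1e3:.4f} ms")
```

Under these assumptions the delay equals $2p/c$ only for a source directly overhead and shrinks quickly as the incidence becomes grazing, which is the behavior the paragraph describes.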
FIG. 1. (Color online) The maximum difference in path length between the direct and reflected sound. The source is at distance d from the ears, which are at height p from the ground. The direct path length is d (solid black line), and the reflected path (dashed black line) is shorter than the dotted path, which has length d + 2p. Thus, the difference in path length is always smaller than 2p.
FIG. 2. (Color online) Geometrical models of reflections. Panels A and B show two basic models of sound reflections by the ground and a wall. The polar angle (A) or lateral angle (B) of the sound source is $\varphi_S$ (A) or $\theta_S$ (B) and that of the reflected sound is $\varphi_S^*$ (A) or $\theta_S^*$ (B); p is the distance between the ears and the obstacle. d* is the reflected path length. Panel C shows, for fixed locations of the source and receiver, that the set of reflection points on the horizontal plane that produce a fixed delay $\Delta = (d^* - d)/c$ forms an ellipse with foci at the source and receiver.
More generally, consider a source and a receiver at distance d from each other. The set of reflection points such that the reflected path length is a fixed quantity d* is an ellipse¹ with foci at the source and receiver [Fig. 2(C)]. In three dimensions, this would be an ellipsoid. All obstacles that are tangent to this ellipse produce reflections with delay $\Delta = (d^* - d)/c$. For example, delays shorter than $\Delta = 1$ ms are produced by all objects within the ellipse that passes through the two external points aligned with the source and receiver at distance $\Delta c/2 = 17$ cm from them [Fig. 2(C)].
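As a worked illustration of the ellipse construction, the sketch below computes the semi-axes of the constant-delay ellipse and the distance by which it extends beyond the source and receiver along their common axis. It relies only on the standard property that the sum of distances to the two foci equals the major axis, here d* = d + cΔ. The function name is hypothetical and c = 343 m/s is assumed.

```python
import math


def delay_ellipse(d, delay, c=343.0):
    """Geometry of the ellipse of reflection points that give a fixed delay.

    d: source-receiver distance (m); delay: reflection delay (s).
    Returns (semi_major, semi_minor, overshoot) in meters, where overshoot
    is how far the ellipse extends beyond each focus along the axis.
    """
    path = d + c * delay            # reflected path length d*
    semi_major = path / 2.0
    semi_minor = math.sqrt(semi_major ** 2 - (d / 2.0) ** 2)
    overshoot = semi_major - d / 2.0  # equals c * delay / 2
    return semi_major, semi_minor, overshoot


a, b, extra = delay_ellipse(d=1.5, delay=1e-3)
print(f"semi-major {a:.3f} m, semi-minor {b:.3f} m, overshoot {extra * 100:.1f} cm")
```

The overshoot does not depend on d, which is why the text can quote a single value of about 17 cm for a 1 ms delay.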
In Table I, we have listed the maximum delay for a reflection at the ground for various species as well as the values of the delays when the source is 1.5 m away from the ears. For a human, for whom the average distance between ears and ground is 1.7 m, the reflected waves are often delayed by several milliseconds [for example, the delay is 6.5 ms at a distance of 1.5 m; Fig. 3(A)]. This is within the range of echo suppression in the precedence effect. But the ears of small mammals such as guinea pigs or gerbils are only a few centimeters away from the ground. For instance, even if the gerbil decides to stand up to view its environment, it is only 12 cm high in this fully erect posture, which means a maximum delay of 730 μs for a reflection at the ground. This upper bound corresponds to the situation where the source is at a polar angle of 90°, but when the source is at a polar angle of 0°, i.e., at the same height as the animal, the delay is generally much shorter because the sound hits the ground with grazing incidence. This makes the reflected path length geometrically very close to the direct path length. For example, the delay is only 56 μs for a source at 1.5 m from the gerbil. In addition, because they are small, these mammals often move with their ears close to reflecting surfaces, such as embankment slopes. Even for cats (about 20 cm high), the delays are shorter than 1.2 ms in all configurations, and shorter than 1 ms in most natural cases [Fig. 3(B), white line]. Even if the fusion threshold were lower for small mammals than for humans, short delays such as 56 μs are very likely to fall below their fusion threshold. Therefore, the precedence effect cannot account for the processing of early reflections in ecological situations for small mammals. Instead, the binaural cues that are available to these animals should include the impact of early reflections. For this reason, we now focus on small mammals, but we shall come back to the case of humans in the discussion.
FIG. 3. The computed delay in milliseconds between the direct and the ground-reflected (panels A, B) or wall-reflected (panels C, D) sound waves as a function of the distance from the head and polar angle (A, B) or lateral angle (C, D) of the source. The ground is assumed to be at a distance of p = 1.7 m (A) or p = 0.2 m (B) from the head. The hatched area represents the geometrically impossible cases where the source would be below the ground. The contours for different delays are shown as dashed lines.
TABLE I. The computed time delay between the front and reflected sound waves using a ground reflection model for typical ear-ground distances in several species (top line), and for two different source-ground distances (equivalently, two different polar angles).
Species | Human | Dog (Labrador) | Dog (Bulldog) | Cat/Marmoset | Gerbil | Guinea pig | Mouse
Ear-ground distance (m) | 1.70 | 0.75 | 0.4 | 0.2 | 0.12 (at most) | 0.08 | 0.02
Front-reflected wave delay (ms):
Maximum delay | 10.3 | 4.5 | 2.4 | 1.2 | 0.73 | 0.48 | 0.12
dist.: 1.5 m, elev.: 0° (source height = ear-ground distance) | 6.5 | 1.82 | 0.58 | 0.154 | 0.056 | 0.025 | 0.0016
dist.: 1.5 m, source height = 1 m (elev. in brackets) | 4.62 (−25°) | 2.3 (9°) | 1.3 (22°) | 0.66 (28°) | 0.39 (30°) | 0.26 (32°) | 0.07 (33°)
III. IMPACT OF REFLECTIONS ON BINAURAL CUES
A. A simple case: Rigid surface and no diffraction
We start by considering an elementary situation with a
reflection at a rigid surface but no diffraction effects by the
head or similar obstacle. We also neglect the attenuating
effect of distance. Our treatment is similar to Sec. 3.1 in
Blauert, 1997 (“phasor diagrams” in Fig. 3.8), which also
includes a few relevant references such as Leakey (1959)
and Wendt (1963). Suppose the source signal is a pure tone
with frequency f, which we represent as a complex signal $e^{2\pi i f t}$. The ear receives the sum of the direct sound (delay d) and reflected sound (delay d*):
$$S(t) = e^{2\pi i f (t - d)} + e^{2\pi i f (t - d^*)}, \tag{1}$$
which is proportional to $1 + e^{-2\pi i f \Delta}$, where $\Delta = d^* - d$ is the delay of the reflection relative to the direct sound. This results in an interference which may be constructive ($f\Delta = n$, n being an integer) or destructive ($f\Delta = n + 1/2$) [see Fig. 4(A)]. More precisely, for a fixed delay $\Delta$, the level and phase of the summed signal are periodic functions of the tone frequency f, with spectral period $1/\Delta$. At the frequency of each destructive interference [$f = 1/(2\Delta) + n/\Delta$], the level drops to 0 (i.e., $-\infty$ dB), but more interestingly the phase abruptly shifts by $\pi$, between $-\pi/2$ and $\pi/2$ [see Fig. 4(B)]. This corresponds to a sign change in the summed vector near the interference frequency.²
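The monaural interference pattern can be reproduced in a few lines. This is a minimal sketch for a rigid surface and no head, evaluating the level and phase of the factor $1 + e^{-2\pi i f \Delta}$; the function name and the 250 μs example delay are illustrative assumptions.

```python
import cmath
import math


def comb_response(f, delay):
    """Level (dB) and phase (rad) of 1 + exp(-2*pi*i*f*delay).

    f: tone frequency (Hz); delay: reflection delay relative to the
    direct sound (s). Rigid reflector, no attenuation, no diffraction.
    """
    h = 1.0 + cmath.exp(-2j * math.pi * f * delay)
    mag = abs(h)
    level_db = 20.0 * math.log10(mag) if mag > 0.0 else -math.inf
    return level_db, cmath.phase(h)


delay = 250e-6                    # hypothetical delay
notch = 1.0 / (2.0 * delay)       # first destructive interference (Hz)
for f in (0.0, notch - 10.0, notch + 10.0, 1.0 / delay):
    level, phase = comb_response(f, delay)
    print(f"{f:7.1f} Hz: {level:7.2f} dB, phase {phase:+.3f} rad")
```

Evaluating just below and just above the notch shows the deep level minimum and the phase flipping sign across the notch, i.e., a jump of about π.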
We now examine the impact of monaural interferences on
binaural cues. In the case of a vertical wall, the delay between
the direct and reflected sounds is not the same at the two ears
[see Fig. 4(C)]. The signal at the left ear is proportional to
$$S_L(t) \propto 1 + e^{-2\pi i f \Delta_L} \tag{2}$$
and the signal at the right ear is proportional to
$$S_R(t) \propto \left(1 + e^{-2\pi i f \Delta_R}\right) e^{-2\pi i f\,\mathrm{ITD}}. \tag{3}$$
Thus, to understand the consequences on binaural cues, we need to compare the vectors $1 + e^{-2\pi i f \Delta_L}$ and $1 + e^{-2\pi i f \Delta_R}$ [see Fig. 4(D)]. Assuming that the two delays are similar (i.e., $\Delta_L \approx \Delta_R$), the interaural phase difference (IPD) and the ILD are approximately periodic functions of f, with spectral period $1/\Delta$, where $\Delta \approx (\Delta_L + \Delta_R)/2$. As in the monaural case, two things occur at the frequencies of destructive interferences [$f = 1/(2\Delta_L) + n/\Delta_L$ and $f = 1/(2\Delta_R) + n/\Delta_R$]: the ILD goes to $\pm\infty$, and the IPD changes discontinuously. Between the interference frequencies, the IPD change due to the interferences is close to $\pi$. Thus, interferences cause large variations in ILD and discontinuities in IPD.
In the case of a reflection at the ground, the delays $\Delta_L$ and $\Delta_R$ are also slightly different. For example, consider a sound source at polar angle 0° and lateral angle 90° [see Fig. 4(E)]. If d is the distance from the sound source to the left ear and l is the interaural distance, then $\Delta_R = \Delta_L (d + l)/d$. This is a small difference, except for very close sources, and therefore the IPD and ILD should not be very degraded in general. However, interferences are still present and cause large changes in ILD and discontinuities in IPD near $f = 1/(2\Delta) + n/\Delta$, as for a reflection at the wall [see Fig. 4(F)]. In particular, the IPD changes by $\pi$ near interference frequencies.
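The binaural consequence of slightly different delays at the two ears can be sketched as follows, for an acoustically transparent head and a rigid reflector. The sign conventions (right minus left for both IPD and ILD), the function names, and the example delays are assumptions made for illustration.

```python
import cmath
import math


def ear_sum(f, delay):
    """Direct-plus-reflected factor 1 + exp(-2*pi*i*f*delay) at one ear."""
    return 1.0 + cmath.exp(-2j * math.pi * f * delay)


def binaural_change(f, delay_left, delay_right):
    """Change in IPD (rad, wrapped to (-pi, pi]) and ILD (dB) caused by
    the reflection, using a right-minus-left convention (an assumption)."""
    left, right = ear_sum(f, delay_left), ear_sum(f, delay_right)
    ipd = cmath.phase(right * left.conjugate())
    if abs(left) == 0.0 or abs(right) == 0.0:
        ild_db = math.nan  # exactly on a notch the ILD diverges
    else:
        ild_db = 20.0 * math.log10(abs(right) / abs(left))
    return ipd, ild_db


# Hypothetical delays: notches at 2000 Hz (left) and about 1923 Hz (right).
for f in (1000.0, 1960.0, 3000.0):
    ipd, ild = binaural_change(f, 250e-6, 260e-6)
    print(f"{f:6.0f} Hz: IPD change {ipd:+.3f} rad, ILD change {ild:+.2f} dB")
```

With these example delays the IPD change is small away from the notches and close to π for frequencies lying between the two notch frequencies.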
Thus in this simple situation, we predict that with a reflection the change in IPD and ILD should be a periodic function of frequency with spectral period $1/\Delta$, with maximal changes at the frequencies of destructive interferences: $f = 1/(2\Delta) + n/\Delta$. We now consider more realistic models of the auditory environment.
FIG. 4. (Color online) The effect of a reflection on binaural cues in a simple case: An acoustically transparent head. Panel A shows that, for a pure tone of frequency f, the interference between the direct and reflected sound is constructive when the delay is $\Delta = n/f$ (top), and destructive when $\Delta = (n + 1/2)/f$ (bottom). The two signals can be represented as unit vectors on a circle (right), where the angle represents the phase of the reflected sound (dashed black line, “reflected”) relative to the direct sound (solid gray arrow). The total signal is the vector sum (dashed line, “total”). Its angle is therefore the phase of the total signal and its length is its amplitude. A short arrow would correspond to a destructive interference. When the phase of the reflected sound goes beyond $\pi$ (panel B), the phase of the total signal jumps from $\pi/2$ to $-\pi/2$. With a reflection at a wall (panel C), the delay between direct and reflected sounds at the left ear (solid arrow) is different from that at the right ear (dashed arrow). Thus, in panel D, the phases of the reflected sound for the left and right ears, relative to the direct sounds, are represented by distinct solid and dashed black arrows, respectively (left). The IPD change due to the reflection is the angle between the two summed vectors. It changes discontinuously when the phase of one monaural signal exceeds $\pi$ (right). With a reflection at the ground (panel E), the delays between reflected and direct sounds are similar but not exactly equal at the left and right ear (solid and dashed thick lines, respectively). Panel F shows that the IPD change due to the reflection is generally small (left), except when a discontinuity occurs, when the phase of one monaural signal exceeds $\pi$ (right).
B. Natural surfaces and diffraction effects
1. Natural surfaces
An incident wave on a real surface is partly reflected and
partly absorbed in a way that depends on incidence and fre-
quency. We describe reflections at the ground (the equations
are equivalent for a vertical wall if the polar angle $\varphi$ is replaced by the lateral angle $\theta$). As before, we assume that the receiver is at distance d from the source and at distance d* from the mirrored source, and we assume that the sound wave is an isotropic spherical wave propagating outward from a central point. The most widely used approach to match the boundary conditions for such a wavefront impinging on a plane finite-impedance surface is the Weyl–Van der Pol solution (Sutherland and Daigle, 1998), where the sound field at the receiver can be expressed as the sum of direct and reflected sound fields. If we define P(d, f) as the complex pressure amplitude at frequency f and distance d from the source (in the absence of reflecting surfaces), then the sound field $P_{rec}$ at the receiver is well approximated by the equation
$$P_{rec} = P(d, f) + Q(d^*, f, \varphi_S)\, P(d^*, f), \tag{4}$$
where Q is a spherical reflection factor, a complex-valued function of frequency, angle, and distance. The reflected sound field $Q(d^*, f, \varphi_S)\, P(d^*, f)$ depends on the polar angle $\varphi_S$, the distance d* (because of the spherical wave hypothesis) and the acoustic properties of the two media (air and ground), which are frequency dependent. Detailed formulae can be found in Appendix C. Many models have been used to describe these acoustic properties for typical outdoor surfaces. We used the Delany–Bazley model (Delany and Bazley, 1970; Miki, 1990), where these properties are described with a single parameter, the effective flow resistivity $\sigma$. In Appendix C, we list typical values for a number of natural textures (from Cox and D’Antonio, 2009): high values correspond to rigid surfaces (e.g., concrete) while low values correspond to soft textures (e.g., snow).
Figure 5(A) illustrates the general properties of reflections on natural surfaces. When the ground is soft, a low frequency wave can partially penetrate the surface, which delays the reflected wave. At higher frequencies, the incident sound is partly absorbed and the delay is shorter. The absorption and delay effects are reduced for more rigid surfaces, i.e., those with a higher resistivity $\sigma$. With a grazing incidence, the reflection is greater and the delay is longer. These properties are also shown in Figs. 5(B) and 5(C) for two incidence angles and two natural grounds: grass ($\sigma = 10^5$ Pa s m$^{-2}$) and sand ($\sigma = 6 \times 10^5$ Pa s m$^{-2}$).
FIG. 5. (Color online) The acoustical properties of natural surfaces. Panel A shows that low frequencies penetrate a porous surface deeper than high frequencies, which produces a delay $\Delta$ in the reflected sound. Panels B and C show the amplitude and phase delay of Q, respectively, as a function of frequency (for a sine tone) and for two values of flow resistivity $\sigma$. Two incidence polar angles are used (10°, 80°) and are indicated on each curve. The distance between source and receiver is assumed to be d = 1.5 m.
As in the simple situation described in Sec. III A, the direct and reflected waves interfere and produce large variations in level as a function of frequency [see Fig. 6(A)]. Quantitatively, these are not as large as with a rigid surface because the reflections are partly absorbed. Let us first consider a realistic reflecting surface with reflection factor Q(f) and neglect diffraction effects. The pressure decreases with distance as 1/d. Thus the total pressure at the ear for a pure tone at frequency f is proportional to
$$P_{rec}(d, f) = \frac{1}{d} + \frac{Q(f)}{d^*}\, e^{-2\pi i f \Delta} = \frac{1}{d} + \frac{|Q(f)|}{d^*}\, e^{-2\pi i f (\Delta + \tau(f))}, \tag{5}$$
where $\tau(f)$ is the phase delay, in seconds, of Q(f). Thus, level and phase vary approximately periodically with frequency, with spectral period $1/\Delta$. The reflection factor Q(f) varies somewhat with frequency, and determines the spectral envelope in Fig. 6(A). The relative amplitude of the interference pattern varies between
$$\frac{1}{d}\left(1 - \frac{d}{d^*}\,|Q|\right) \tag{6}$$
and
$$\frac{1}{d}\left(1 + \frac{d}{d^*}\,|Q|\right) \tag{7}$$
with a first minimum at frequency $f_0 = 1/(2(\Delta + \tau(f_0)))$. For a hard surface, this is close to $f_0 = 1/(2\Delta)$.
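The envelope bounds of Eqs. (6) and (7) can be checked against Eq. (5) with a toy reflection factor. The sketch below assumes a constant, real-valued Q (so that τ = 0), which is a simplification and not the Delany–Bazley model used in the paper; the function names and numerical values are hypothetical.

```python
import cmath
import math


def received_pressure(f, d, d_star, q, c=343.0):
    """Direct plus reflected pressure in the form of Eq. (5), no diffraction.

    q: complex reflection factor at frequency f (here supplied by the
    caller; a constant is used below purely for illustration).
    """
    delay = (d_star - d) / c
    return 1.0 / d + q / d_star * cmath.exp(-2j * math.pi * f * delay)


def envelope_bounds(d, d_star, q_mag):
    """Minimum and maximum amplitude of the pattern, as in Eqs. (6)-(7)."""
    ratio = d / d_star * q_mag
    return abs(1.0 - ratio) / d, (1.0 + ratio) / d


d, d_star, q = 1.5, 1.6, 0.8      # hypothetical geometry and |Q|
delay = (d_star - d) / 343.0
low, high = envelope_bounds(d, d_star, abs(q))
print("bounds:", low, high)
print("at first notch:", abs(received_pressure(1.0 / (2.0 * delay), d, d_star, q)))
print("at first peak: ", abs(received_pressure(1.0 / delay, d, d_star, q)))
```

Because |Q| < 1 and d* > d, the minimum stays above zero, which is why the notches are shallower than for the rigid, lossless case of Sec. III A.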
In Table II, we report the simple estimates $f_0 = 1/(2\Delta)$ for the first interference frequency and $f_p = 1/\Delta$ for the spectral period, and compare them to the accurate values derived from Eq. (5), for a rigid surface (i.e., $\sigma = \infty$). It can be seen that the simple estimate of the spectral period $f_p$ is very accurate for high $\sigma$, while the first interference frequency $f_0$ seems to be slightly overestimated. These estimates become less accurate as $\sigma$ decreases, i.e., for more porous surfaces. This is to be expected since porous surfaces introduce an additional delay in low frequencies because of the properties of the reflection factor Q (see Fig. 5).
FIG. 6. (Color online) The pressure after a reflection by a ground or a wall for a point source. Panel A shows the pressure at the receiver if there were no head and therefore no diffraction. The geometrical parameters of the source following the conventions of Fig. 1 are indicated at the top left. The parameters of the ground- or wall-reflected wave are shown within the plots. Panels B and C show the pressure at the left and right ears after filtering by the HRTFs of the sphere model, for the ground (B) and wall (C) reflection models. The geometrical parameters of the source are the same as in (A).
TABLE II. Computed values found in the spectra shown in Fig. 6(A) and estimates $f_0 = 1/(2\Delta)$ for the first interference frequency and $f_p = 1/\Delta$ for the spectral period (see Secs. III A and III B).
Quantity | Value | Ground reflection | Wall reflection
First notch $f_0$ | Computed value, $\sigma = 10^5$ | 1179 Hz | 2051 Hz
First notch $f_0$ | Computed value, $\sigma = 6 \times 10^5$ | 1418 Hz | 2878 Hz
First notch $f_0$ | Estimate ($\sigma = \infty$) | 1590 Hz | 3624 Hz
Spectral period $f_p$ | Computed value, $\sigma = 10^5$ | 3103 Hz | 6606 Hz
Spectral period $f_p$ | Computed value, $\sigma = 6 \times 10^5$ | 3151 Hz | 6942 Hz
Spectral period $f_p$ | Estimate ($\sigma = \infty$) | 3180 Hz | 7248 Hz
2. Diffraction effects
The second aspect that we must take into account is the
diffraction of sounds by the head. This effect is described by
head-related transfer functions (HRTFs). The HRTF is the
pressure at the ear divided by the reference free-field pres-
sure at the center of the head, for a source at a given loca-
tion. The head scatters sound waves in a way that depends
on the incidence of the wave relative to the head and the
sound frequency, and therefore HRTFs are functions of frequency, distance, lateral angle, and polar angle. For a sine-tone frequency f, we can thus define HRTFs with reflections using the Weyl–Van der Pol equation described above:
$$H_{rec}(d, f, (\theta_S, \varphi_S)) = H(d, f, (\theta_S, \varphi_S)) + Q \cdot H(d^*, f, (\theta_S^*, \varphi_S^*)). \tag{8}$$
The HRTFs for the left and right ears will be denoted $H_L$ and $H_R$, respectively. In the following calculations, we use a spherical head model with shifted ears, for which the diffraction function is completely known and has been extensively described, for instance, in Duda and Martens (1998) and Ono et al. (2008). Details are given in Appendix D. The geometrical properties of the spherical head model were chosen to give a match to the shape of a guinea pig head, with ears at the back and top: a head radius of 2 cm and ears at a lateral angle of ±110° and a polar angle of +30°. The head was assumed to be at p = 8 cm from the ground or wall, corresponding to the average distance between ears and ground in guinea pigs.
The results are shown in Figs. 6(B) and 6(C). As can be
seen, the monaural interference patterns are qualitatively simi-
lar when diffraction effects are introduced. However, these
introduce additional delays which must be taken into account
in our estimates of f
0
and f
p
. Indeed, for the right ear, head fil-
tering introduces a phase shift /S
RðfÞ¼argðHRðf;d;ðhS;uSÞÞÞ
for the direct wave and /S
RðfÞ¼argðHRðf;d;ðhS;uSÞÞÞ for
the reflected wave. The phase difference /S
RðfÞ/S
RðfÞadds
to the phase of the reflection wave in Eq. (1). Thus, the first
minima in the sound spectrum at the right ear should occur at
frequency f
0
such that 2pf0Dþ2pf0sðf0Þþ/S
Rðf0Þ/S
Rðf0Þ
¼p. The phase shift induced by head filtering can be approxi-
mated by the time needed by the sound wave to travel around
the spherical head from its impact point with the head to the
right ear. This distance is called an “orthodromic
3
distance
and is the distance rOD(p
1
,p
2
)betweentwopointsp
1
and p
2
on a sphere of radius r(Deza and Deza, 2006), see Fig. 7and
Appendix Dfor details. We can therefore approximate the
phase difference /S
RðfÞ/S
RðfÞas follows:
/S
RðfÞ/S
RðfÞ2pfr
cðODððhS;uSÞ;ðhM;uMÞÞ
ODððhS;uSÞ;ðhM;uMÞÞÞ
2pfD/R;ð9Þ
where D/Ris a propagation delay that does not depend on
frequency. This is applicable when the wavelength is small
compared to the head size. Thus our estimates of interfer-
ence parameters for a rigid surface with r¼1 are now
f0¼1
2ðDþD/RÞand fp¼1
DþD/R
:(10)
Table III shows that the interference spectral period f
p
is
very well approximated by this formula, while the first inter-
ference frequency f
0
is overestimated. Part of the explanation
is probably the additional delays in low frequency intro-
duced by diffraction (Kuhn, 1977), which we did not take
into account in our formula.
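
As a rough illustration of Eqs. (9) and (10), the following Python sketch computes the head-related delay from great-circle angles and then the two interference estimates. The function names, the use of 3-D direction vectors in place of the paper's angle coordinates (whose convention is given in Appendix D and is not reproduced here), and the speed of sound of 343 m/s are assumptions made for this example only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def orthodromic_angle(u, v):
    """Great-circle angle (rad) between two directions given as 3-D vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def head_delay(direct_dir, reflected_dir, ear_dir, radius, c=SPEED_OF_SOUND):
    """Delay of Eq. (9): extra travel time around the sphere for the
    reflected wave relative to the direct wave (seconds)."""
    od_reflected = orthodromic_angle(reflected_dir, ear_dir)
    od_direct = orthodromic_angle(direct_dir, ear_dir)
    return radius / c * (od_reflected - od_direct)


def interference_estimates(path_delay, extra_delay=0.0):
    """Eq. (10) for a rigid surface: returns (f0, fp) in Hz."""
    total = path_delay + extra_delay
    return 1.0 / (2.0 * total), 1.0 / total
```

Read the other way round, Eq. (10) turns the wall-reflection estimate for the left ear in Table III ($f_p$ = 12 211 Hz) into a total delay $D + \Delta\phi_R$ of about 82 μs.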
3. Impact on ITDs and ILDs
As is shown in Fig. 8, these monaural interferences result in
oscillations in ILD and ITD as a function of frequency. When
the sound is reflected on a vertical wall [Figs. 8(A) and 8(B)],
the spectral period $f_p$ of interferences is different for the two
ears, which has two consequences. First, all spectral notches at
each ear are seen in the ILD. For instance, in the example shown
in Figs. 8(A) and 8(B) [as well as Fig. 6(A)], because the wall
TABLE III. Computed values derived from Eq. (5) and prediction for the first notch and the interference spectral period frequency [see Eq. (10)].

                                              Wall reflection           Ground reflection
                                              Left ear     Right ear    Left ear     Right ear
First notch $f_0$
  Computed value, $\sigma = 10^5$             3357 Hz      1375 Hz      1011 Hz      1011 Hz
  Computed value, $\sigma = 6 \times 10^5$    4428 Hz      1871 Hz      1216 Hz      1179 Hz
  Estimate ($\sigma = \infty$)                6105 Hz      2675 Hz      1352 Hz      1376 Hz
Spectral period $f_p$
  Computed value, $\sigma = 10^5$             10 240 Hz    4876 Hz      2626 Hz      2695 Hz
  Computed value, $\sigma = 6 \times 10^5$    11 378 Hz    5120 Hz      2695 Hz      2768 Hz
  Estimate ($\sigma = \infty$)                12 211 Hz    5351 Hz      2703 Hz      2753 Hz
FIG. 7. The orthodromic distance OD between the point of incidence of a
source S with the surface of a sphere (incidence angles $(\theta_S, \varphi_S)$) and the ear
(coordinates $(\theta_M, \varphi_M)$). See Appendix D for the formulas.
J. Acoust. Soc. Am., Vol. 132, No. 1, July 2012 B. Goure´vitch and R. Brette: Impact of early reflections 15
Author's complimentary copy
faces the left ear, the variations of acoustical pressure vs fre-
quency are stronger for the left ear. Thus, the first maximum in
ILD [Fig. 8(A)] stems from the right ear (1871 Hz), though the
strongest one comes from the left ear (4428 Hz). The two inter-
ference spectral periods from each ear are present in the ILD in-
terference pattern—note the irregular oscillatory pattern in Fig.
8(A). Second, the interference patterns for both ILD and ITD
are much stronger for the wall reflection than for the ground
reflection. Indeed, for a reflection at the ground, the monaural interference
patterns are similar for the two ears, with almost identical
values across ears for both $f_0$ and $f_p$ (Table III). This is
because the direct and reflected sound have the same lateral
angle, so that the head-filtering phase shifts of the reflected and direct
waves are nearly equal, $\phi_R^{S'}(f) \approx \phi_R^{S}(f)$. As a consequence, the ILD is
much less affected than with a reflection at a wall.
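
The contrast between wall and ground reflections can be reproduced with a minimal two-path model. The sketch below assumes an acoustically transparent head (no diffraction), a frequency-independent reflection gain, and an ILD defined as right minus left; `reflection_gain` and the function names are hypothetical choices for illustration, not the surface model used in the paper.

```python
import numpy as np


def two_path_response(freqs, delay, reflection_gain=0.8):
    """Transfer function of the direct sound plus one delayed reflection,
    expressed relative to the direct sound alone."""
    freqs = np.asarray(freqs, dtype=float)
    return 1.0 + reflection_gain * np.exp(-2j * np.pi * freqs * delay)


def binaural_changes(freqs, delay_left, delay_right, reflection_gain=0.8):
    """Change in ILD (dB, right minus left) and IPD (rad) due to the reflection."""
    h_left = two_path_response(freqs, delay_left, reflection_gain)
    h_right = two_path_response(freqs, delay_right, reflection_gain)
    ild_change_db = 20.0 * np.log10(np.abs(h_right) / np.abs(h_left))
    ipd_change = np.angle(h_right * np.conj(h_left))
    return ild_change_db, ipd_change


freqs = np.linspace(100.0, 10000.0, 500)
# Wall-like case: clearly different delays at the two ears.
ild_wall, ipd_wall = binaural_changes(freqs, 80e-6, 190e-6)
# Ground-like case: nearly identical delays at the two ears.
ild_ground, ipd_ground = binaural_changes(freqs, 370e-6, 363e-6)
```

With identical delays at both ears the two monaural patterns cancel exactly in the binaural difference, which is the limiting case of the ground reflection discussed above.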
With weakly reflective textures such as snow, interfer-
ences in ITDs and ILDs are not visible with a ground reflec-
tion. They are present, although reduced, with a wall
reflection. With strongly reflective textures such as asphalt,
the magnitude of interferences is larger than with the $\sigma$
values used here.
Monaural spectral notches correspond to dramatic
changes in ITD. For this reason, the same interference pat-
tern is expected in the ITD vs frequency curve as in the ILD
vs frequency curves. However, while the spectral periodicity
of these patterns agrees well, the precise frequencies of
extrema in ITDs can be more difficult to predict: for a reflec-
tion at the ground, monaural interferences are similar in both
ears and the abrupt changes in ITDs match the extrema in
ILDs [see Fig. 8(D)], but for a reflection at a wall, the de-
structive interferences may appear at different frequencies at
the two ears, and the resulting binaural interference pattern
is more complex. In general, the extrema in ITDs and ILDs
are interlaced, and extrema in ITDs are closer to extrema in
ILDs when the reflecting surface is harder. For instance, in
Fig. 8(B), the first peak in ITD is almost in the middle of two
ILD extrema for $\sigma = 10^5$ (gray solid line) while it is close to
the first peak in ILD for $\sigma = 6 \times 10^5$ (black solid line).
We may wonder whether these interference patterns
overlap with the hearing range of various species. In
Table IV, we report the computed values $f_0 = 1/(2(D + \Delta\phi_R))$
as an estimate for the first extrema in ILD/ITD after
ground reflection for the animals in Table I. The interference
spectral period $f_p$ would be close to twice these values. This
estimate corresponds to an acoustically hard surface; the
calculated values will be slightly lower for an acoustically
softer surface, which introduces an additional delay at low
frequencies. For these calculations, ear position and head radius
were estimated from photographs and, for simplicity, a spherical
head was assumed (many animals do not have one, but this has only
a limited influence on the final values). For tall mammals such as
humans, Table IV shows that
ITDs and ILDs can be disturbed at low frequencies. How-
ever, it could be argued that the law of the first wavefront,
which occurs for such delays, will limit this impact by sup-
pressing the reflection. We return to this issue in the discus-
sion. For small mammals (e.g., cats, gerbils, guinea pigs),
FIG. 8. Panels A and B show the ILD and ITD estimated from HRTFs of the sphere model after reflection by a wall. Panels C and D show these ILD and ITD
after reflection by the ground. The parameters of the source following the conventions of Fig. 1 are indicated at the top. The parameters of wall- and ground-
reflected waves are shown within plots A and C, respectively.
the interference spectral period $f_p$ is generally large for sources
at a polar angle of 0°, unless they are very close, and
therefore only the first extremum $f_0$ is expected to impact
sound localization. However, for a fixed source at distance
1.5 m and height 1 m (last two rows of Table IV), giving a high polar
angle for small mammals, both $f_0$ and $f_p$ fall within the
physiologically relevant frequency range for these animals. In this
case, interferences in ITD and ILD may impact sound local-
ization. At first sight, it would seem that this impact is nega-
tive because it degrades the “normal” binaural cues.
However, as is outlined in the next section, if these interfer-
ences can be systematically related to the source location, in
particular distance, then they may also provide a usable cue
for sound localization.
IV. INTERFERENCES AS LOCALIZATION CUES
The interference pattern is determined by the delay
between the direct and reflected sounds, which depends on
the location of the source. In Fig. 9, we show the computed
change in ILD and ITD produced by the reflection (see
Appendix E), as a function of frequency and either polar
angle or lateral angle, for a reflecting surface with a flow resistivity
of $6 \times 10^5$ (sand).
The interferences in the ITD vs frequency curves follow
the same pattern as in the ILD vs frequency curves. Similar
interference patterns as a function of polar angle have been
obtained from real recordings of ILDs in gerbils after reflec-
tion by a plywood floor (Maki et al., 2003). It can be seen
TABLE IV. Approximation of spectral period and first extrema frequency of interferences in sound waves incoming at the right ear in a ground reflection
model for the various species of Table I. The distance from the source is assumed to be 1.5 m and the same two sets of source height/polar angle as in Table I
are used. The lateral angle is assumed to be 40°.

Species                                    Human    Dog (Labrador)   Dog (Bulldog)   Cat       Gerbil    Guinea pig   Mouse
Head radius (cm)                           8.75     8                9               4.5       1.3       2            0.8
Angles $\theta_M$, $\varphi_M$ (°)         100, 5   110, 30          110, 30         110, 40   110, 40   110, 30      110, 40
Minimum $f_0$ (Hz) (across all sources
  considered in Sec. V, Fig. 12)           49       111              208             417       685       1042         4167

dist.: 1.5 m, elev.: 0° (source height = ear-ground distance)
First extremum $f_0$ (Hz)                  76       260              761             2831      8352      17 700       253 000
Interference period $f_p$ (Hz)             152      520              1522            5662      16 700    35 400       506 000

dist.: 1.5 m, source height = 1 m
First extremum $f_0$ (Hz)                  107      205              342             659       1182      1685         6045
Interference period $f_p$ (Hz)             214      410              684             1318      2364      3370         12 000
FIG. 9. The ITD and ILD changes produced by a reflection (A,B: ground; C,D: wall). In panels A and B, the lateral angle is set to 40° for the ground reflection,
as in Fig. 8. In panels C and D, the polar angle is set to 40° for the wall reflection.
that there is a systematic variation in both the first interference
frequency $f_0$ and the interference spectral period $f_p$ with
polar angle for ground reflection and lateral angle for wall
reflections. These two values might thus be usable as local-
ization cues. However, for a reflection at a wall, they also
depend on the location and orientation of the reflecting sur-
face, which are very variable in real environments (see
Appendix B). The resulting variability in binaural cues is
unpredictable and may therefore be seen as “noise.” For a
reflection at the ground, however, this variability is predictable,
because the ground is generally approximately horizontal and at a
fixed distance from the ears. Indeed, if the polar angle is known,
then the distance from the source can be estimated from the first
interference frequency $f_0$ or the interference spectral period $f_p$
[Fig. 10(A)]. As we have seen, these two values are not very
sensitive to the nature of the ground. Conversely, if the dis-
tance is known, then the source polar angle can be estimated
[Fig. 10(B)]. Nevertheless, the interference spectral period
would probably not be a very helpful localization cue for
large distances, because very few interference peaks occur
within the physiological frequency region, as the second
peak occurs at frequency $f_0 + f_p \approx 3 f_0$. However, the first
interference frequency would remain useful over a larger distance
range.
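
To make the distance cue concrete, the sketch below inverts the ground-reflection geometry for a source at the same height as the ears (polar angle of 0°) and counts how many interference peaks fall below an upper frequency limit. It assumes point receivers (no head diffraction, so $D = 1/(2 f_0)$ with $\Delta\phi_R$ neglected), a rigid ground, and a speed of sound of 343 m/s; all function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def distance_from_first_notch(f0, ear_height, c=SPEED_OF_SOUND):
    """Horizontal source distance (m) recovered from the first interference
    frequency f0 (Hz), for a source at ear height above a flat ground.

    With extra path e = c / (2 f0), the image-source geometry gives
    sqrt(d^2 + 4 h^2) = d + e, hence d = (4 h^2 - e^2) / (2 e).
    """
    extra_path = c / (2.0 * f0)
    if extra_path >= 2.0 * ear_height:
        raise ValueError("f0 is too low for this ear height")
    return (4.0 * ear_height ** 2 - extra_path ** 2) / (2.0 * extra_path)


def peaks_in_range(f0, f_max):
    """Number of peaks f0, 3 f0, 5 f0, ... (i.e., f0 + n fp with fp = 2 f0)
    that do not exceed f_max."""
    if f_max < f0:
        return 0
    return int((f_max - f0) // (2.0 * f0)) + 1
```

For example, with $f_0$ = 17 700 Hz (the guinea pig entry of Table IV at 1.5 m) and an illustrative upper limit of 50 kHz, `peaks_in_range` returns 1, which is the situation described above where only the first interference frequency is usable.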
It is also useful to look at the change of the values of
ITD and ILD relative to when there is no reflection. In
Fig. 11, we show this relative change for the situation shown
in Figs. 6 and 8. It appears that ILD is more affected than
ITD, both for ground and wall reflections. Even for the moderate
flow resistivity value $\sigma$ of $6 \times 10^5$ used for the calculations
in this figure, the variations in ILDs are very large in
both cases, especially for low frequencies. Indeed, many
surfaces are highly reflective at low frequencies and the ILD is
very small in the absence of reflections. Both binaural cues
are much less affected by the ground than by the wall.
Although the position of the ITD and ILD extrema depends
on the delay between the direct and reflected sounds and
thus may be viewed as information, the amplitude of the
changes depends on surface type: the amplitude is higher for
more rigid surfaces. In general, then, these may be seen as
FIG. 10. Panel A shows, for the ground reflection, the relationship between source-head distance (vertical axis) and interference pattern spacing and first
extrema (horizontal axis), for a polar angle of 0°. Panel B shows, for the ground reflection, the relationship between polar angle and interference pattern spacing
and first extrema, for a source-head distance of 1.5 m. The source is assumed to be at the same level as the animal and its lateral angle is 40°.
FIG. 11. The relative change in ITD and ILD due to reflections, compared to ITD and ILD of the direct wave, for the source parameters used in Figs. 6 and
10. The sound wave is reflected by (A) the ground or (B) a wall. The flow resistivity is $\sigma = 6 \times 10^5$.
a degradation of binaural cues, and it appears that ITD is the
more reliable cue.
V. RANGES OF ITD AND ILD WITH REFLECTIONS
We now look at how reflections change the range of
ITDs and ILDs processed by the auditory system. We calcu-
lated the maximum ITD and ILD as a function of frequency
across lateral angle and polar angle coordinates within a
large grid of 393 spherical positions evenly distributed
around the sphere (between −45° and 90° polar angle), similar
to that in Behrend et al. (2004), and for distances between
1 and 20 m at increments of 50 cm. Figure 12 shows the
results for ground and wall reflections. In both cases, the
range of ILDs is greatly extended, especially in low frequen-
cies where ILDs are usually small. Without reflections, ILDs
are always smaller than about 10 dB. With a reflection at a
wall, they can reach more than 30 dB. This phenomenon is
accentuated by surfaces with higher flow resistivities.
The range of ITDs is not strongly affected by a reflec-
tion at the ground, but it is very much extended with wall
reflections, as was noted by McFadden (1973). Without
reflections, the ITD changes continuously with frequency
and therefore it is usually estimated by “unwrapping” the
IPD, that is, by considering that ITD is a function of frequency
that is consistent with IPD modulo $2\pi$ and has minimum
variation: $|2\pi f_2\,\mathrm{ITD}(f_2) - 2\pi f_1\,\mathrm{ITD}(f_1)| < \pi$ for two
neighboring frequencies $f_1$ and $f_2$. This does not work with
reflections because destructive interferences make ITD a
discontinuous function of frequency. Therefore, for Fig. 12,
we chose a conservative estimate, by choosing the ITD con-
sistent with IPD that is closest to the anechoic ITD (see
details in Appendix E). Thus the range of ITDs with reflec-
tions shown in Fig. 12 is an underestimation. For high fre-
quencies, this underestimation is not very informative
because the method artificially constrains the estimate to be
within $1/(2f)$ of the anechoic ITD. This is known as the
$\pi$-limit (Brand et al., 2002; Hancock and Delgutte, 2004;
Joris and Yin, 2007; McAlpine et al., 2001). However, for
low frequencies and reflections on a wall, ITDs can be arbi-
trarily large within this limit, especially for surfaces with
high flow resistivities. Indeed, as we have seen, the IPD is
close to $\pi$ near a destructive interference. This means that
the range of ITDs is very much extended by reflections at
low frequencies compared to an anechoic environment.
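
The two ways of turning an IPD into an ITD described in this section can be sketched as follows. The code assumes the convention IPD = $2\pi f \cdot$ITD (modulo $2\pi$) and, for the unwrapping variant, that the IPD at the lowest frequency is already unambiguous; the function names are hypothetical and the details of the actual procedure are those of Appendix E, which is not reproduced here.

```python
import numpy as np


def itd_by_unwrapping(freqs, ipd):
    """Anechoic-style estimate: unwrap the IPD across neighboring
    frequencies (jumps larger than pi are removed), then convert to ITD."""
    freqs = np.asarray(freqs, dtype=float)
    return np.unwrap(np.asarray(ipd, dtype=float)) / (2.0 * np.pi * freqs)


def itd_closest_to_reference(freqs, ipd, reference_itd):
    """Conservative estimate: among all ITDs consistent with the IPD
    modulo 2*pi, pick the one closest to a reference (e.g., anechoic) ITD.
    The result always lies within 1/(2f) of the reference."""
    freqs = np.asarray(freqs, dtype=float)
    ipd = np.asarray(ipd, dtype=float)
    period = 1.0 / freqs
    candidate = ipd / (2.0 * np.pi * freqs)
    cycles = np.round((reference_itd - candidate) / period)
    return candidate + cycles * period
```

The second function makes the limitation explicit: because of the rounding step, the estimate cannot deviate from the reference by more than half a period, which is why the resulting range is an underestimate and uninformative at high frequencies.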
VI. DISCUSSION AND SUMMARY
A. Limitations of the models
Although we have tried to consider realistic acoustic
properties for the reflecting surfaces, our models rely on a
number of approximations.
First, we did not consider the frequency-dependent
absorption properties of air (ISO 9613-1:1993, 1993). In par-
ticular, high frequencies tend to be more attenuated than
lower frequencies and the effect depends on distance. How-
ever, as we only considered early reflections, the spectrum
of the direct and reflected sounds should be almost identical,
FIG. 12. The panels A and B show the maximum ILD and ITD with and without ground reflections, for sounds between 1 and 20 m from the head and with a
ground-head distance of 0.08 m. The panels C and D show these maximum ILD and ITD with and without wall reflections. The $\pi$-limit and the ITD corresponding
to the distance between the two ears (3.8 cm in the sphere model) are indicated in dashed lines in B and D.
and therefore the differential effects of air absorption should
be minor.
Second, we only considered a single reflection. In a real
environment, there are many more reflections. In addition,
under specific atmospheric conditions, a downward refraction
by the atmosphere might generate additional ray paths and
therefore one or more reflections to the ground (Sutherland
and Daigle, 1998). Late reflections are assumed to be sup-
pressed by the auditory system, via the precedence effect, but
there may be additional early reflections. However, at least
for low frequencies, most of the reflected power should be
contributed by large flat and thick surfaces such as the ground.
But a natural ground is not a perfectly horizontal surface. In
particular, small irregularities such as rocks or stones may
heavily deviate and attenuate the reflection of high frequency
sounds. For instance, an obstacle of less than 5 cm interferes
with sound waves above 7 kHz. As a consequence, interfer-
ences at such high frequencies might be of lower amplitude in
real environments than in our model. We also assumed that
the texture is homogeneous and of infinite depth, which is
clearly not the case in natural environments. Nonetheless,
many studies have shown that field measurements of sound
propagation are in close agreement with the theoretical mod-
els used in this paper, in particular for grass which has an av-
erage flow resistivity similar to the values we used (Chessell,
1977; Embleton et al., 1983; Rasmussen, 1981).
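
The 5 cm / 7 kHz figure quoted above is consistent with comparing the obstacle size to the acoustic wavelength. The short sketch below makes that comparison explicit; treating "size comparable to one wavelength" as the criterion and using 343 m/s for the speed of sound are assumptions of this example.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def wavelength(frequency_hz, c=SPEED_OF_SOUND):
    """Acoustic wavelength in meters."""
    return c / frequency_hz


def frequency_for_wavelength(size_m, c=SPEED_OF_SOUND):
    """Frequency (Hz) whose wavelength equals the given size."""
    return c / size_m


lambda_7khz = wavelength(7000.0)             # about 0.049 m
f_for_5cm = frequency_for_wavelength(0.05)   # about 6860 Hz
```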
Third, the sphere is a highly simplified model of the
head of an animal. We introduced shifted ears but other
aspects such as the nose and the torso also play a role in the
diffraction of sounds by the body.
Finally, we only considered point sources with omnidir-
ectional directivity (i.e., monopoles). Real sources may devi-
ate from this model, for example human speech or mammal
vocalizations are not omnidirectional. For these sources, we
would expect that the direct sound has more power than the
reflected one.
B. Impact of reflections on binaural cues
For many small animals such as cats and guinea pigs,
the ground contributes early reflections with delays no lon-
ger than about a millisecond. This maximum delay is only
for a sound source near the ears and it quickly decreases
with distance: for example, it is just 150 μs for a sound
source at 1.5 m from the head of a cat. It is unlikely that
such reflections can be suppressed by the auditory system,
considering that psychophysical measurements indicate that
the threshold for fusion is close to 1 ms. Therefore, the
reflection modifies the binaural cues perceived by the ani-
mal, and so the perceived binaural cues in an ecological
environment, even a simple one with only a ground, are not
the same as in an anechoic environment.
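
The order of magnitude of these delays follows from image-source geometry. The sketch below assumes a flat rigid ground, point receivers, and a speed of sound of 343 m/s; the ear height of 0.2 m used for the cat is an illustrative guess and is not a value taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def ground_reflection_delay(horizontal_distance, source_height, ear_height,
                            c=SPEED_OF_SOUND):
    """Delay (s) between the direct sound and its ground reflection.

    The reflected path is the straight line from the mirror image of the
    source (at height -source_height) to the ear.
    """
    direct = math.hypot(horizontal_distance, source_height - ear_height)
    reflected = math.hypot(horizontal_distance, source_height + ear_height)
    return (reflected - direct) / c


# Hypothetical cat: ears 0.2 m above the ground, source at ear height, 1.5 m away.
cat_delay = ground_reflection_delay(1.5, 0.2, 0.2)

# Upper bound for a source right next to the ears of a guinea pig (0.08 m).
guinea_pig_max = ground_reflection_delay(0.0, 0.08, 0.08)
```

Under these assumptions the cat delay comes out close to the 150 μs quoted above, and the guinea pig bound stays below half a millisecond.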
In barn owls, a number of studies have addressed the
neural and behavioral correlates of acoustical reflections
(Keller and Takahashi, 1996a, 1996b, 2005; Nelson and
Takahashi, 2010; Spitzer and Takahashi, 2006). In a natural
environment, the ground can be far from the owl’s ears when
it flies, but when it is about to catch its prey the ground is
very close. Therefore the same remarks as for small animals
apply, and the maximum delay of the reflected sound is a
fraction of a millisecond. For example, for a mouse just below
the owl, the delay is about 100 μs, which corresponds to an
interference at frequency 5 kHz, right in the middle of the
hearing range of these animals. This is well below the fusion
threshold for owls, about 0.5 ms (Keller and Takahashi,
1996a, 1996b). In binaural neurons of the inferior colliculus,
the response to a reflection is suppressed when the delay is
longer than 0.5 ms, but it is consistent with cross-correlation
of the summed direct and reflected signals below 0.5 ms
(Keller and Takahashi, 1996b). This suggests that the
reflected signal is indeed retained by the auditory system for
such short delays. These electrophysiological observations
are consistent with behavioral measurements, in which owls
turn their head to the leading sound when the delay is longer
than 1 ms (Keller and Takahashi, 1996a).
The combination of the direct and reflected sounds
leads to monaural interferences, around frequencies
$f = 1/(2D) + n/D$, which are then seen in binaural cues. Periodic
distortions in ILDs have been previously reported in
studies of simulated gerbil HRTFs with ground reflection
(Grace et al., 2008) and in human HRTFs with ground reflection
(Rakerd and Hartmann, 1985) or for the ear facing a very
near wall (Shinn-Cunningham et al., 2005). The largest modifications
occur for reflections at a vertical wall, because the
lateral angle and therefore the binaural cues are very different
for the two sounds. For a reflection at the ground, the modifi-
cations are smaller but still very significant near interference
frequencies. ILDs are typically more affected than ITDs,
because ILDs are defined as a ratio of monaural levels.
Indeed, even though a constructive interference cannot pro-
duce a gain larger than 6 dB, the change in ILD can reach
10–15 dB because of destructive interferences.
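
The asymmetry between constructive and destructive interference can be written out for a single reflection of relative amplitude $a$ ($0 \le a \le 1$) and relative phase $\theta$. This is a simplified worked bound that ignores head diffraction, and the values of $a$ below are illustrative rather than taken from the paper.

```latex
\[
  1 - a \;\le\; \left| 1 + a\,e^{-i\theta} \right| \;\le\; 1 + a ,
\]
\[
  \text{maximum gain: } 20\log_{10}(1 + a) \;\le\; 20\log_{10} 2 \approx 6.0\ \text{dB},
  \qquad
  \text{maximum attenuation: } 20\log_{10}(1 - a) \to -\infty \ \text{as } a \to 1 .
\]
\[
  a = 0.7:\; 20\log_{10}(0.3) \approx -10.5\ \text{dB},
  \qquad
  a = 0.8:\; 20\log_{10}(0.2) \approx -14.0\ \text{dB}.
\]
```

Because the ILD is the difference between the two monaural levels in dB, a notch of this depth at one ear that is not matched at the other ear shifts the ILD by a comparable amount.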
C. Humans vs small animals
We have focused our study on small animals that live on
the ground. In these animals, delays between direct and
reflected sounds are very short in many situations. For
humans, the delays of reflected sounds can be longer, up to
10 ms for a reflection at the ground. It might be argued that
this is beyond the fusion threshold and therefore such echoes
should be suppressed by the auditory system. However, there
are reasons to think that our results may also apply to
humans. First, the delays are shorter when the source is far
[the delay is <1 ms when the source is >15 m away, see
Fig. 3(A)], and therefore interferences in binaural cues
should be seen for distant sources. Second, even when the
delay is long, the processing of binaural cues by the auditory
system might still be impacted, because they are processed
in frequency bands, where the direct and reflected sounds
may interact. The duration of the impulse response of an au-
ditory filter is inversely related to its bandwidth and there-
fore with its center frequency, and two delayed impulse
responses interact if the delay is shorter than a few periods.
The first interference occurs at frequency $f = 1/(2D)$: for
that frequency, the delay corresponds to only half a period of
the waveform. Therefore, this interference should be seen in
the response of binaural neurons, at least in the earliest
stages of the binaural pathway. This means that, in the
responses of these neurons, the direct and reflected sounds
should merge when the delay is smaller than a value which
is inversely proportional to frequency. This is consistent
with psychophysical measurements in humans (Dizon and
Colburn, 2006; Kirikae et al., 1971).
Thus it seems plausible that early reflections also inter-
fere with direct sounds in humans and affect binaural cues, so
that all the results shown here should also apply to humans.
The main difference with small animals is that humans stand,
and therefore the ground is further from the ears. This implies
that delays between direct and ground reflected sounds are
typically longer for sources in the horizontal plane, even at
relatively far distances. Longer delays mean lower interfer-
ence frequencies. For example, with a 10 ms delay, the first
interference frequency is 50 Hz and the next ones are at 150
and 250 Hz; with a 6.5 ms delay, i.e., a distance of 1.5 m,
they are 77, 231, and 385 Hz. These are in the hearing range
of humans. In addition, as we have seen in Sec. III B, natural
surfaces are generally highly reflective at low frequencies.
Moreover, humans typically stand on artificial textures such
as concrete or asphalt that are strongly reflective. Therefore,
we expect strong modifications of low frequency binaural
cues for sources at the same height as the listener. Finally, we
note that, as for owls, the delays are short if the source,
instead of the ears of the listener, is close to the ground. In
this case, all the results we have shown directly apply.
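
The interference frequencies quoted in this paragraph follow directly from $f = 1/(2D) + n/D$. A minimal sketch (the function name is hypothetical):

```python
def interference_frequencies(delay_s, count=3):
    """First `count` destructive-interference frequencies (Hz) for a
    reflection delayed by `delay_s` seconds: f = 1/(2D) + n/D."""
    return [(2 * n + 1) / (2.0 * delay_s) for n in range(count)]


human_10ms = [round(f) for f in interference_frequencies(0.010)]
human_6p5ms = [round(f) for f in interference_frequencies(0.0065)]
```

Rounded to the nearest hertz, the two lists reproduce the values given in the text (50, 150, 250 Hz and 77, 231, 385 Hz).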
D. Noise and information contributed by early
reflections
The interferences between direct and reflected sounds
modify the binaural cues, especially ILDs. These modifica-
tions could be seen as a source of noise in the localization of
sound sources. The relevant issue is the variability of these
cues for a given location, when other unknown factors are
allowed to vary, for example the nature of the ground or the
exact orientation of the reflecting surface. With this point of
view, the changes introduced by walls or similar obstacles
would qualify as noise, because they are large and very sen-
sitive to other parameters that seem difficult to precisely esti-
mate by other means, such as the orientation and texture of
the obstacle. Indeed, performance in localization is degraded
by strong reflections (Croghan and Grantham, 2010; Giguère
and Abel, 1993; Rakerd and Hartmann, 2005) and vision is
given a stronger perceptual weight (Truax, 1999).
How can the auditory system deal with these disturban-
ces? For broadband sounds, frequency integration might be a
useful strategy: ITDs and ILDs vary with respect to frequency
around an average value which could be used to estimate the
source location, as seen in Fig. 8. In echoic environments,
human performance is indeed poor for pure tones (Hartmann,
1983; Rakerd and Hartmann, 1985) and transients seem to be
important for localization in rooms (Hartmann and Rakerd,
1989; Rakerd and Hartmann, 1985). Another possible strategy
for continuous sounds is to use the motion of the source or the
voluntary motion of the head, and to select the most favorable
configurations—for example, on the basis of variability of
binaural cues across frequency. However, we note that the
detectability of a masked signal is not generally improved by
its motion (Xiao and Grantham, 1997). A third hypothesis is
that the auditory system may select the most “plausible” bin-
aural cue: for example, very large values for ITDs could be
seen as implausible and discarded, giving a stronger weight
for ILD (Rakerd and Hartmann, 1985). We found that ITDs
were generally less affected by reflections than ILDs. This
would suggest that the auditory system should rely more on
ITDs than on ILDs. However, for high frequencies, ITD is
ambiguous unless the sound is broadband. Even though high
frequency neurons in the inferior colliculus are sensitive to
envelope ITDs (Griffin et al., 2005; Joris, 2003; Nelson and
Takahashi, 2010), a recent study of the directional sensitivity
of such neurons in reverberation suggests that ILD provides
better directional information than envelope ITDs in high fre-
quencies (Devore and Delgutte, 2010).
Binaural cues are less affected by reflections at a ground
than at a wall, especially ITDs. More importantly, even
though binaural cues differ from the anechoic case, they are
not very variable: the ground is generally horizontal, the dis-
tance between the ears and the ground is fixed (or at least
likely to be known by the animal) and the influence of the na-
ture of the ground is relatively small. As we have seen, the in-
terference frequencies are directly related to the delay
between the direct and reflected sounds, and therefore to the
polar angle and distance. The amplitude of these interferences
depends on the nature of the ground, but their frequency does
not. Therefore, interferences contributed by the reflection at
the ground are a potential spatial cue. It is known that reverberation,
in particular the reverberation time, contributes to the perception
of distance and spaciousness (e.g., Blauert, 1997; Truax, 1999).
However, the role of single reflections
has not been fully described. Preliminary results show that
spatial maps of some neurons are unexpectedly more accurate
with a reflection than in free-field, at least in the external
nuclei of the inferior colliculus of the gerbil (Maki et al.,
2005). We suggest that interferences in binaural cues might
provide a cue to distance and/or polar angle.
The main cues for polar angle are thought to be (1) mon-
aural spectral notches introduced by the direction-specific
attenuation of particular frequencies by the pinna (Algazi
et al., 2001; Blauert, 1997; Musicant and Butler, 1985;
Tollin and Yin, 2003; Wightman and Kistler, 1992) and (2)
head movements (Blauert, 1997; Thurlow et al., 1967).
Spectral notches occur at high frequencies and their fre-
quency is positively correlated with polar angle (Maki and
Furukawa, 2005; Tollin and Yin, 2003) while the inverse
correlation is seen for the interference frequencies [see Fig.
10(B)]. Therefore, these two cues to polar angle should be
essentially independent. The potential role of interferences
in polar angle estimation was noticed in a similar study on
simulated HRTFs of gerbils with ground reflection (Grace
et al., 2008), and is also in agreement with a psychoacousti-
cal study showing that the polar angle of the sound source
was estimated with greater accuracy with a sound-reflecting
surface on the floor (Guski, 1990). This effect was not seen
with a wall reflection, which suggests that the auditory sys-
tem indeed relies on the knowledge of the head-ground dis-
tance to extract the relevant information.
However, interference frequency provides an ambiguous
cue to polar angle because it also depends on distance. Distance
perception is generally seen as less accurate than the perception of
lateral angle and polar angle in mammals (Bronkhorst and
Houtgast, 1999; Zahorik et al., 2005). Several cues have been
previously investigated [see reviews in Coleman (1963), Mershon
and Bowers (1979) and Brown and May (2005)]: (1)
level of known sounds: the sound intensity varies with dis-
tance according to the inverse-square law; (2) frequency
spectrum: high-frequency components of broadband signals
are attenuated more rapidly by air propagation than low-
frequency ones; (3) movement parallax: the direction of a
source is less modified by listener movements for a distant
source than for a close source; (4) acoustic field width: it
should be larger for a close source; (5) direct-to-reverberant
energy ratio: at long distance, the many reflections induced
by the propagation of sounds in all directions increase the
amount of energy received after the direct sound wave
(Bronkhorst and Houtgast, 1999; Mershon and King, 1975;
Shinn-Cunningham et al., 2001; Yan-Chen Lu and Cooke,
2010; Zahorik et al., 2005). Many studies on distance perception
suggest that auditory image distortion, including reflections
and scattering, by natural environments is an important
cue for distance perception (Brown and Gomez, 1992; Brown
and Waser, 1988; Waser and Brown, 1986; Wiley and Richards,
1978). For instance, it has been shown that distance
judgments are more accurate in a reverberant space than in an
anechoic space (Mershon and Bowers, 1979; Mershon and
King, 1975; Nielsen, 1993; Sheeline, 1982; Zahorik, 2002).
In addition to these cues, our results suggest that binaural
interferences might be another one. Nevertheless, for sources
in the horizontal plane, which is probably the most common
situation, interferences occur at high frequencies (>20 kHz)
at distances greater than 2 m for guinea pigs [Fig. 10(A)],
meaning that only a few interference peaks will fall within
the hearing range of the animal. For these species, this cue
might therefore be more useful for close sources. For humans,
these interferences could provide information over a larger
range (for example, the first interference frequency is 76 Hz
at 1.5 m, see Table IV). This hypothesis could be tested with
psychophysical experiments.
These interferences are also seen in monaural signals,
and therefore they could be seen as monaural cues. However,
only the binaural cues are independent of the sound source:
for example, frequency-dependent changes in level in a mon-
aural signal can be due either to reflections or to the spec-
trum of the source. If binaural rather than monaural
interference cues are used by the auditory system, one pre-
diction is that their effect should only be seen away from the
median plane, where ITDs and ILDs are essentially zero.
Whether early reflections come from the ground or from
a wall, and even though their impact on binaural cues may
depend on many factors, these binaural cues are reproducible
and temporally stable. Thus, even if ITDs and ILDs cannot
be unambiguously mapped to the location of the sound
source, they could still be used as reliable spatial cues to iso-
late a sound source from a noisy background. Perhaps a per-
son or animal could learn to associate a source with a
particular pattern of frequency-dependent ITD and ILD,
which could then be used to isolate its signal from those of
competing sources, even if this binaural pattern cannot be
accurately associated with a particular spatial location. This
suggestion has an important implication: the large ITDs and
ILDs due to early reflections are not simply “noise” to be fil-
tered, but instead are naturally occurring cues that may be
encoded by the auditory system.
E. The natural distribution of binaural cues
When early reflections are considered, the range of ITDs
and ILDs is extended, compared to the anechoic case. This
observation is most interesting for ITDs. In an anechoic environment,
the ITD is limited by the size of the head. For example,
in humans, it does not exceed 650–700 µs at high
frequencies. At low frequencies, it can be about 50% larger
because of diffraction effects (Kuhn, 1977), but it is still limited.
However, with an early reflection, we have seen that a
discontinuity in IPD occurs at the interference frequency, and
this implies that the IPD can take any value between $-\pi$ and $\pi$. This means
that the ITD can take any value within the $\pi$-limit, i.e.,
between $-1/(2f)$ and $1/(2f)$, where $f$ is the frequency. In
many species, many binaural neurons are tuned to best delays
that are greater than the maximum range of ITDs in an
anechoic environment (McFadden, 1973). The proportion of
such neurons differs between species, but they have been consistently
observed in rabbit (Kuwada et al., 1987), guinea pig
(McAlpine et al., 2001), cat (Hancock and Delgutte, 2004;
Kuwada and Yin, 1983; Yin and Chan, 1990), gerbil (Brand
et al., 2002), chinchilla (Thornton et al., 2009), kangaroo rat
(Crow et al., 1978), chicken (Köppl and Carr, 2008), and
barn owl (Wagner et al., 2007). In most of these studies, this
proportion is overestimated because the best delays are compared
to the maximum ITD measured at high frequencies,
which is smaller than the actual range incorporating the diffraction
effects at low frequencies (compare, for example,
McAlpine, 2005 and Sterbing et al., 2003). However, it
remains that a significant number of binaural neurons are
tuned to ITDs that lie outside the range of anechoic ITDs.
This observation has motivated a new theory of ITD process-
ing in mammals, according to which ITD is represented by
the relative activity of two populations with symmetrical best
delays lying outside the natural range (Grothe et al., 2010), a
strategy sometimes referred to as slope coding or the two-
channel model. This theory is in contrast with the “peak
coding” theory (Carr and Konishi, 1990), where ITD is repre-
sented by the best delay of the maximally activated neuron.
Note that other coding strategies are possible (Colburn,
1973). Our results imply that the natural range of ITDs is
much larger than expected from anechoic measurements
when considering reflections on the ground or on obstacles: the
ITD can take any value within the $\pi$-limit. Therefore, the large
best delays observed in binaural neurons of small mammals
are consistent with peak coding, and more importantly they
make the two-channel model problematic because the ratio of
activities in the two channels is an ambiguous representation
of ITD when the best delay lies within the natural range of
ITDs, i.e., different ITDs give the same ratio. It could be
argued that large ITDs due to reflections are disturbances and
22 J. Acoust. Soc. Am., Vol. 132, No. 1, July 2012 B. Gourévitch and R. Brette: Impact of early reflections
therefore there is no reason for the auditory system to encode
them. However, as we have seen, for a reflection at the
ground, these large ITDs due to reflections contribute infor-
mation about sound location rather than noise, because they
are reproducible cues. For reflections on walls or obstacles,
even though these ITDs may not contribute information about
sound location, they are still temporally stable and therefore
potentially convey spatial information for segregating sound
sources. Thus, it does not seem implausible that the auditory
system may encode these large ITDs.
We also note that binaural neurons compare
the information only after the physical signals have been
processed by the auditory periphery. As has been noted by
other authors, nonlinear peripheral processes, including for
instance half-wave rectification and adaptation, can cause
monaural interactions between direct and reflected sounds
that can result in unexpected changes in the cues effectively
seen by the binaural neurons (Hartung and Trahiotis, 2001;
Trahiotis and Hartung, 2002).
F. Summary
In realistic auditory environments, binaural cues can be
modified by reflections. When the delay between the direct and
reflected sounds is long, the auditory system can isolate the
onset of the direct sound. However, in many cases, these delays
are very short. For example, for a reflection at the ground, the
delay of the reflected sound is no more than $2p/c$, where $p$ is
the distance of the ears from the ground and $c$ is the speed of
sound. This gives about 10 ms for humans, about 1 ms for cats,
and less for smaller mammals. In many practical situations, the
delay is substantially lower than this upper limit.
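As a rough numerical check of this bound, the following Python sketch computes the delay of a ground reflection with an image-source construction for an acoustically transparent head (diffraction is ignored). The function name, the 1.70 m ear height, and the test distances are illustrative assumptions; c = 343 m/s is the value listed in Appendix A.

```python
import math

C = 343.0  # speed of sound (m/s), value listed in Appendix A


def ground_delay(d, p, elevation=0.0, c=C):
    """Delay (s) between the direct and the ground-reflected sound.

    d: source-ear distance (m); p: height of the ear above the ground (m);
    elevation: polar angle of the source seen from the ear (rad).
    Image-source geometry only; diffraction by the head is ignored.
    """
    horizontal = d * math.cos(elevation)
    vertical = 2.0 * p + d * math.sin(elevation)  # vertical drop to the mirror source
    d_reflected = math.hypot(horizontal, vertical)
    return (d_reflected - d) / c


p_human = 1.70                    # assumed ear height (m)
upper_bound = 2.0 * p_human / C   # the 2p/c limit
for d in (0.5, 1.5, 5.0, 20.0):   # arbitrary source distances (m)
    delay = ground_delay(d, p_human)
    print(f"d = {d:5.1f} m   delay = {1e3 * delay:5.2f} ms   bound = {1e3 * upper_bound:5.2f} ms")
```

In this sketch the delay approaches the 2p/c limit only for sources very close to the ear and shrinks as the source moves away, which is one way to read the remark that practical delays are usually well below the limit.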
This delay $\Delta$ results in destructive interferences at
each ear at frequencies about $f = 1/(2\Delta) + n/\Delta$, where $n$ is
an integer, which produce large modifications of ITDs and
ILDs near these frequencies. Therefore, binaural cues in an
ecological environment, even a simple one with only a
ground, are not the same as in an anechoic environment.
These modifications are larger for ILDs than for ITDs.
They are larger for a vertical wall than for a horizontal
ground, because the interaural axis is parallel to the
ground. In all cases, they remain very significant near inter-
ference frequencies. These modifications depend on the
delay of the reflected sound, and therefore on source dis-
tance. At a finer level of detail, they also depend on the
acoustical properties of the reflecting surface. Hard surfa-
ces (e.g., concrete) reflect more energy than soft ones (e.g.,
snow) and therefore have a stronger impact on binaural
cues, but the impact is significant with typical natural
surfaces. The analyses also imply that the range of ITDs
and ILDs in natural environments is significantly extended
compared to the anechoic case.
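The interference frequencies quoted in this summary can be listed with a few lines of Python. This is a minimal sketch that assumes a single reflection of equal amplitude and a transparent head; the function names and the 1 ms example delay are hypothetical.

```python
import cmath
import math


def notch_frequencies(delay, f_max, f_min=0.0):
    """Destructive-interference frequencies (Hz) for a reflection delay (s).

    Implements f = 1/(2*delay) + n/delay for n = 0, 1, 2, ... up to f_max.
    """
    if delay <= 0:
        raise ValueError("delay must be positive")
    notches = []
    n = 0
    while True:
        f = (0.5 + n) / delay
        if f > f_max:
            break
        if f >= f_min:
            notches.append(f)
        n += 1
    return notches


def summed_gain(f, delay):
    """Amplitude of direct + delayed copy, both of unit amplitude (assumption)."""
    return abs(1.0 + cmath.exp(-2j * math.pi * f * delay))


# Hypothetical example: a 1 ms delay, frequencies up to 5 kHz.
example = notch_frequencies(1e-3, 5000.0)
for f in example:
    print(f"{f:8.1f} Hz   gain = {summed_gain(f, 1e-3):.2e}")
```

With a weaker reflection, as for a soft ground, the gain at these frequencies would be reduced but not cancelled, so the notches mark where the modifications of binaural cues are largest rather than where the signal vanishes.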
As a final remark, we note that the ears of small mam-
mals are very close to the ground and possibly to objects on
the ground, making their acoustical environment more vari-
able. Thus, their acoustical cues for sound localization may
be quite different from those available to humans, with their
ears about 1.70 m above the ground. These differences
should be kept in mind when extrapolating the results of
animal studies to humans, or to other species: binaural cues
depend not only on the shape of the head and ears, but
also on the properties of the natural acoustical environment.
ACKNOWLEDGMENTS
This work was supported by the European Research
Council (Grant No. ERC StG 240132). We thank the
reviewers for their constructive suggestions. We also thank
Marc Rébillat for comments on the manuscript.
APPENDIX A: LIST OF SYMBOLS
$\alpha_S^*$    Incidence angle of the reflected wave on the wall
arg    Argument of a complex number
$c$    Speed of sound (343 m/s here)
$d$    Distance source-ears (direct path)
$d^*$    Distance source-ears (reflection path)
$\delta$    Absolute delay of the direct sound
$\delta^*$    Absolute delay of the reflected sound
$\Delta$    Delay between direct sound and reflection (airhead model)
$\Delta_L / \Delta_R$    $\Delta$ at the left/right ear, respectively
$\Delta\phi_R$    Delay at the right ear between direct and reflected waves due to sound diffraction (s)
erfc    Complementary error function
$f_0$    First notch frequency in a spectrum (Hz)
$f_p$    Interference pattern spacing in a spectrum (Hz)
$\phi$    Phase shift
$\varphi_M$    Polar angle for the left ear
$\varphi_S$    Polar angle of a point source $S$
$F$    “Ground wave” function
$H$    Head-related transfer function (HRTF)
$H_{\mathrm{rec}}$    HRTF after reflection
$H_L / H_R$    HRTF at the left/right ear, respectively
$h_m$    $m$-th order spherical Hankel function
$i$    Imaginary number: square root of $-1$
$k_i$    Wave number of the sound field in the $i$-th medium
$l$    Interaural distance
$\mu$    Normalized frequency
$n$    Integer
$O$    Center of the head
OD    Orthodromic distance
$p$    Distance head-ground or head-wall
$P$    Complex pressure
$P_{\mathrm{rec}}$    Complex pressure at the receiver
$P_m$    $m$-th order Legendre polynomial
$Q$    Spherical reflection factor
$r$    Radius of sphere
$R$    Plane wave reflection coefficient
$\rho$    Normalized distance to the source
$S$    Point source
$S^*$    Mirror source
$\sigma$    “Effective” flow resistivity (Pa s m$^{-2}$)
$S_L / S_R$    Signal at the left/right ear, respectively
$\tau$    Phase delay of $Q$
$\theta_M$    Lateral angle for the left ear
$\theta_S$    Lateral angle of a point source $S$
$(X, Y, Z)$    Cartesian coordinate system
$x_S, y_S, z_S$    Cartesian coordinates of $S$
$w$    Numerical distance
$Z_i$    Specific acoustic impedance of the $i$-th medium
APPENDIX B: GEOMETRICAL MODELS OF
REFLECTION
We consider the case where a sound wave is reflected at
an obstacle before reaching its target. We consider two situa-
tions: reflections on a horizontal ground or on a vertical
wall, parallel to the median plane (Fig. 2). We give here the
detailed calculations for the delay between direct and
reflected sounds as a function of the distance from the source
to the head and the polar (ground case) or lateral (wall case)
angle of the sound source to the head. Following the notation
used in Fig. 2, the angles of the reflection relative to the head
are given by the following equations, in the case of a reflecting
ground:
$$\theta_S^* = \theta_S, \tag{B1}$$
$$\varphi_S^* = \arctan\left(\frac{d\cos(\varphi_S)}{2p + d\sin(\varphi_S)}\right) - \frac{\pi}{2}, \tag{B2}$$
$$d^* = \frac{d\cos(\varphi_S)}{\cos(\varphi_S^*)}. \tag{B3}$$
In the case of a reflecting wall, both lateral and polar angles
are modified by the reflection. The easiest way to compute
angles for S* is to consider the Cartesian coordinate system
(X,Y,Z), in which S* is a translation of S along the Y axis (that
of the two ears, which is perpendicular to the wall and therefore
parallel to SS*). We have
$$\begin{aligned}
x_S^* &= d^*\cos\varphi_S^*\cos\theta_S^* = x_S = d\cos\varphi_S\cos\theta_S,\\
y_S^* &= d^*\cos\varphi_S^*\sin\theta_S^* = 2p - y_S = 2p - d\cos\varphi_S\sin\theta_S,\\
z_S^* &= d^*\sin\varphi_S^* = z_S = d\sin\varphi_S,
\end{aligned} \tag{B4}$$
which gives
$$\theta_S^* = \arctan\left(\frac{2p - d\cos(\varphi_S)\sin(\theta_S)}{d\cos(\varphi_S)\cos(\theta_S)}\right), \tag{B5}$$
$$d^* = \sqrt{x_S^{*2} + y_S^{*2} + z_S^{*2}} = \sqrt{d^2 - 4pd\cos(\varphi_S)\sin(\theta_S) + 4p^2}, \tag{B6}$$
$$\varphi_S^* = \arcsin\left(\frac{d\sin(\varphi_S)}{d^*}\right). \tag{B7}$$
These results are used in the geometrical models of reflec-
tions described in Sec. II.
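The geometry of this appendix can be sketched in Python as follows. The function names are hypothetical, angles are in radians, and the polar angle is assumed to satisfy |polar angle| < π/2 so that Eq. (B3) is well conditioned. For the wall case the sketch uses `math.atan2` instead of the plain arctangent of Eq. (B5); the two coincide for frontal sources, and `atan2` keeps the correct quadrant otherwise.

```python
import math


def ground_image(d, theta, phi, p):
    """Mirror source for a horizontal ground, Eqs. (B1)-(B3).

    d: source distance (m); theta, phi: lateral and polar angles (rad);
    p: distance from the head to the ground (m).
    Returns (theta_star, phi_star, d_star).
    """
    theta_star = theta                                                  # (B1)
    phi_star = math.atan2(d * math.cos(phi),
                          2.0 * p + d * math.sin(phi)) - math.pi / 2.0  # (B2)
    d_star = d * math.cos(phi) / math.cos(phi_star)                     # (B3)
    return theta_star, phi_star, d_star


def wall_image(d, theta, phi, p):
    """Mirror source for a wall parallel to the median plane, Eqs. (B4)-(B7).

    p: distance from the head to the wall (m), on the positive Y side.
    Returns (theta_star, phi_star, d_star).
    """
    x = d * math.cos(phi) * math.cos(theta)            # unchanged by the mirror
    y = 2.0 * p - d * math.cos(phi) * math.sin(theta)  # mirrored along the Y axis
    z = d * math.sin(phi)                              # unchanged by the mirror
    d_star = math.sqrt(x * x + y * y + z * z)          # (B6)
    theta_star = math.atan2(y, x)                      # (B5), quadrant-aware
    phi_star = math.asin(z / d_star)                   # (B7)
    return theta_star, phi_star, d_star


# Hypothetical example: source 2 m away, 0.3 rad lateral, 0.2 rad polar angle.
print(ground_image(2.0, 0.3, 0.2, p=0.1))
print(wall_image(2.0, 0.3, 0.2, p=1.0))
```

The delay between direct and reflected sounds for a transparent head then follows as (d_star - d) / c.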
APPENDIX C: ACOUSTICAL MODEL OF
REFLECTIONS ON NATURAL SURFACES
Here, we give details of a model for the modifications of
a sound wave when it is reflected by a natural surface in a
realistic outdoor environment. In the following, we describe
the ground-reflection model. The equations are identical for
the wall-reflection model, with $\varphi_S^*$ replaced by the incidence
angle $\alpha_S^*$ of the reflected wave on the wall. Note that, since the
wall is parallel to the (X,Z) median plane (see Appendix B),
if $O$ is the center of the head, $\alpha_S^*$ is equal to the angle
between $OS^*$ and its projection on the median plane (X,Z).
Thus,
$$\alpha_S^* = \arccos\left(\frac{\sqrt{x_S^{*2} + z_S^{*2}}}{d^*}\right) = \arccos\left(\frac{\sqrt{d^{*2} - y_S^{*2}}}{d^*}\right) = \arccos\left(\sqrt{1 - \cos^2(\varphi_S^*)\sin^2(\theta_S^*)}\right). \tag{C1}$$
As explained in Sec. II B, we use the Weyl–Van der Pol solution
(Sutherland and Daigle, 1998) for the boundary conditions
of a spherical sound wave reflected on a plane. The
complex sound field $P_{\mathrm{rec}}$ at the receiver is well approximated
by the equation
$$P_{\mathrm{rec}} = P(d, f) + Q(d^*, f, \varphi_S^*)\,P(d^*, f), \tag{C2}$$
where $Q$ is the spherical reflection factor and $P(d, f)$ is the
sound field amplitude at frequency $f$ and distance $d$ from the
source in the absence of reflecting surfaces. In the following,
all quantities except angles implicitly depend on frequency.
$Q$ can be written as $Q = R + (1 - R)F(w)$, where $R$ is the
plane wave reflection coefficient and $(1 - R)F(w)$ is a boundary
correction. The plane wave reflection coefficient $R$ is (Chessell,
1977; Embleton et al., 1983)
$$R = \frac{\sin\varphi_S^* - \dfrac{Z_1}{Z_2}\left(1 - \dfrac{k_1^2}{k_2^2}\cos^2\varphi_S^*\right)^{0.5}}{\sin\varphi_S^* + \dfrac{Z_1}{Z_2}\left(1 - \dfrac{k_1^2}{k_2^2}\cos^2\varphi_S^*\right)^{0.5}}. \tag{C3}$$
$Z_1$ and $Z_2$ are the specific acoustic impedances of the air and
ground surface, respectively. $F(w)$, also known as the
“ground wave” function or as the “boundary loss factor,” is
equal to
$$F(w) = 1 + i\sqrt{\pi w}\,e^{-w}\,\mathrm{erfc}(-i\sqrt{w}), \tag{C4}$$
where
$$w = i\,\frac{4\pi f d^*}{c\,(1 - R)^2}\left(\frac{Z_1}{Z_2}\right)^2\left(1 - \frac{k_1^2}{k_2^2}\cos^2\varphi_S^*\right) \tag{C5}$$
is called the “numerical distance,” erfc is the complementary
error function, and $k_1$ and $k_2$ are the wave numbers of the sound
field in the air and ground surface, respectively. $F(w)$
describes the interaction of the curved wavefront with a
ground of finite impedance. If the wavefront is plane
($d \to \infty$), then $|w| \to \infty$ and $F \to 0$, while if the surface is
acoustically hard, then $|w| \to 0$ and $F \to 1$ (also $R = 1$), so
$Q = 1$. Numerical computation of $F$ is unstable for high values
of $w$ and was performed using the algorithms in Weideman
(1994).
The acoustic impedance of a surface (such as $Z_1$ and $Z_2$)
is the ratio of the amplitude of the sound pressure to the
amplitude of the particle velocity of an acoustic wave that
impinges on the surface. Both concrete and densely packed
glass fiber are high impedance materials relative to air,
whereas grass and some foams are low impedance surfaces.
If a sound wave changes medium, the ratio of the acoustic
impedances of the media, $Z_2/Z_1$, determines the efficiency of
the energy transfer.
There have been numerous models estimating $k_2/k_1$ and
$Z_2/Z_1$ for typical outdoor surfaces. A widely used one, called
the Delany–Bazley model (Delany and Bazley, 1970),
involves a single parameter, the “effective” flow resistivity $\sigma$,
to characterize the ground. Its units are Pa s m$^{-2}$. Following
the time conventions in Embleton et al. (1983) and the improvements
of the Delany–Bazley model by Miki (1990), we have
the expressions
$$\frac{Z_2}{Z_1} = 1 + 0.0699\left(\frac{f}{\sigma}\right)^{-0.632} + 0.1071\,i\left(\frac{f}{\sigma}\right)^{-0.632}, \tag{C6}$$
$$\frac{k_2}{k_1} = 1 + 0.1093\left(\frac{f}{\sigma}\right)^{-0.618} + 0.1597\,i\left(\frac{f}{\sigma}\right)^{-0.618}, \tag{C7}$$
which were originally considered valid in the range $0.01 < f/\sigma < 1$
but which remain well behaved in a larger frequency
range. This model may be used for a locally reacting ground
as well as an extended reaction surface.
Several tables for flow resistivity $\sigma$ have been published
(see Cox and D’Antonio, 2009). They agree on values
around $\sigma = 2 \times 10^4$ for snow, $\sigma = 10^5$ for grass fields or forest
floor, and around $\sigma = 6 \times 10^5$ for sand, dirt, or a roadside with
rocks less than 4 in. in size. $\sigma$ can reach as much as $3 \times 10^7$
for asphalt or $2 \times 10^9$ for concrete. In this paper, we chose
$\sigma = 10^5$ and $\sigma = 6 \times 10^5$ as moderate values for $\sigma$ in order
to simulate credible outdoor environments encountered by
small mammals.
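To give a feel for these values, the following Python sketch evaluates Eqs. (C6) and (C7) and the plane wave reflection coefficient of Eq. (C3) for the two flow resistivities chosen here. It covers only $R$, not the spherical factor $Q$, whose ground wave function requires a complex error function that the standard library does not provide. The function names, the 1 kHz frequency, and the 30° angle are illustrative assumptions, and the angle is taken as a positive grazing angle between the reflected ray and the surface.

```python
import cmath
import math


def miki_ratios(f, sigma):
    """Return (Z2/Z1, k2/k1) from Eqs. (C6)-(C7).

    f: frequency (Hz); sigma: effective flow resistivity (Pa s m^-2).
    """
    x = f / sigma
    z_ratio = 1.0 + 0.0699 * x ** -0.632 + 0.1071j * x ** -0.632
    k_ratio = 1.0 + 0.1093 * x ** -0.618 + 0.1597j * x ** -0.618
    return z_ratio, k_ratio


def plane_wave_reflection(f, sigma, grazing_angle):
    """Plane wave reflection coefficient R of Eq. (C3).

    grazing_angle: positive angle (rad) between the reflected ray and the
    surface (an assumption of this sketch).
    """
    z_ratio, k_ratio = miki_ratios(f, sigma)
    # (Z1/Z2) * (1 - (k1/k2)^2 cos^2(angle))^0.5
    term = cmath.sqrt(1.0 - math.cos(grazing_angle) ** 2 / k_ratio ** 2) / z_ratio
    s = math.sin(grazing_angle)
    return (s - term) / (s + term)


angle = math.radians(30.0)  # arbitrary example angle
for sigma in (1e5, 6e5):    # the two values used in this paper
    r = plane_wave_reflection(1000.0, sigma, angle)
    print(f"sigma = {sigma:.0e}   |R| = {abs(r):.3f}   arg R = {cmath.phase(r):+.3f} rad")
```

In this sketch the harder surface (larger $\sigma$) gives a reflection coefficient of larger magnitude, in line with the statement that hard surfaces reflect more energy than soft ones.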
APPENDIX D: SPHERE MODEL
We describe here our model of HRTFs. We use a spherical
head model with shifted ears, for which the diffraction
function is completely known and has been extensively
described, for instance, by Duda and Martens (1998) and Ono
et al. (2008). Briefly, simulated HRTFs can be obtained from
the frequency-domain solution for the diffraction of an
acoustic wave by a rigid sphere modeling the head.
The source $S$ is at a distance $d$ from a sphere of radius $r$.
Let $\theta$ be the angle of incidence between the ray from the center
of the sphere to the source and the ray to the measurement point
on the surface of the sphere. Given the symmetry of a
sphere, one angle is enough to define the incidence angle. The
transfer function at the surface of the sphere is then given by
$$H(d, f, \theta, r) = -\frac{\rho}{\mu}\,e^{-i\mu\rho}\sum_{m=0}^{\infty}(2m + 1)\,P_m(\cos\theta)\,\frac{h_m(\mu\rho)}{h'_m(\mu)}, \tag{D1}$$
where $\rho = d/r \geq 1$, $\mu = (2\pi r/c)f$, $h_m$ is the $m$-th order spherical
Hankel function, $h'_m$ its derivative, and $P_m$ is the $m$-th order
Legendre polynomial (Rabinowitz et al., 1993; Rayleigh and
Lodge, 1904).
For a spherical head model, $r\theta$ is equal to the shortest
distance that the sound wave has to cover to reach the ear,
i.e., the shortest distance at the surface of the sphere between
an ear of coordinates $(\theta_M, \varphi_M)$ and the incidence angles
$(\theta_S, \varphi_S)$ of the sound wave. This is equal to $r$ times the orthodromic
distance $\mathrm{OD}((\theta_S, \varphi_S), (\theta_M, \varphi_M))$ between these two
points (see Fig. 7), given by Deza and Deza (2006):
$$\mathrm{OD}((\theta_S, \varphi_S), (\theta_M, \varphi_M)) = \arccos\left(\cos\varphi_S\cos\varphi_M\cos(\theta_M - \theta_S) + \sin\varphi_S\sin\varphi_M\right) \tag{D2}$$
(note that when ears are assumed to be antipodal,
i.e., $(\theta_M, \varphi_M) = (90°, 0°)$, then $\mathrm{OD}((\theta_S, \varphi_S), (90°, 0°))
= \arccos(\cos\varphi_S\sin\theta_S)$). Thus, the HRTF at the left ear can
be written $H_L(d, f, \mathrm{OD}((\theta_S, \varphi_S), (\theta_M, \varphi_M)), r)$ and that of
the right ear is $H_R(d, f, \mathrm{OD}((\theta_S, \varphi_S), (-\theta_M, \varphi_M)), r)$ with
the previous notations, assuming that ears are symmetrical
relative to the median plane.
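The orthodromic distance of Eq. (D2) is straightforward to evaluate. The sketch below is a minimal Python version with angles in radians; the function name is hypothetical, and the result is clamped before the arccosine to guard against rounding errors.

```python
import math


def orthodromic_angle(theta_s, phi_s, theta_m, phi_m):
    """Great-circle angle (rad) of Eq. (D2) between the source direction
    (theta_s, phi_s) and the ear position (theta_m, phi_m)."""
    cos_od = (math.cos(phi_s) * math.cos(phi_m) * math.cos(theta_m - theta_s)
              + math.sin(phi_s) * math.sin(phi_m))
    return math.acos(max(-1.0, min(1.0, cos_od)))  # clamp rounding errors


# Antipodal ears: left ear at (90 deg, 0), right ear at (-90 deg, 0).
left_ear = (math.pi / 2.0, 0.0)
right_ear = (-math.pi / 2.0, 0.0)
source = (math.radians(40.0), math.radians(10.0))  # arbitrary example direction

od_left = orthodromic_angle(*source, *left_ear)
od_right = orthodromic_angle(*source, *right_ear)
print(f"OD left = {od_left:.4f} rad, OD right = {od_right:.4f} rad")
```

Multiplying these angles by the sphere radius $r$ gives the corresponding path lengths along the surface of the sphere.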
For computations made over all spherical positions, a
grid of 393 spherical positions evenly distributed around the
sphere (between −45° and 90° polar angle) was used; it is
the same as that found in Behrend et al. (2004).
APPENDIX E: ITD AND ILD ESTIMATES
The interaural transfer function (ITF) is typically
defined as the ratio of contralateral and ipsilateral HRTFs,
which we adapt here to the ratio of left and right HRTFs
for clarity of the equations throughout the paper, since the wall is
placed on the side of the left ear, i.e.,
$$\mathrm{ITF}(f, d, (\theta_S, \varphi_S)) = \frac{H_L(f, d, (\theta_S, \varphi_S))}{H_R(f, d, (\theta_S, \varphi_S))}. \tag{E1}$$
From the ITF, we derive the ILD and interaural phase difference
(IPD) as follows:
$$\mathrm{ILD}(f, d, (\theta_S, \varphi_S)) = 20\log_{10}\left|\mathrm{ITF}(f, d, (\theta_S, \varphi_S))\right|, \tag{E2}$$
$$\mathrm{IPD}(f, d, (\theta_S, \varphi_S)) = \arg \mathrm{ITF}(f, d, (\theta_S, \varphi_S)) \tag{E3}$$
(where the ILD is in dB). To estimate the ITD of direct and
reflected sound waves, we used a conservative estimate,
choosing the ITD consistent with the IPD that is closest to the
anechoic ITD.
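One possible reading of this procedure is sketched below in Python. The function names are hypothetical, and the sign convention relating ITD and IPD (ITD = IPD / (2πf)) is an assumption of the sketch that may need to be reversed depending on the time convention of the HRTFs.

```python
import cmath
import math


def ild_ipd(h_left, h_right):
    """ILD (dB) and IPD (rad) from complex HRTF values at one frequency,
    following Eqs. (E1)-(E3)."""
    itf = h_left / h_right
    return 20.0 * math.log10(abs(itf)), cmath.phase(itf)


def itd_closest_to_anechoic(ipd, f, itd_anechoic):
    """ITD (s) consistent with the IPD that is closest to itd_anechoic.

    All ITDs of the form (ipd + 2*pi*k) / (2*pi*f), k integer, are
    consistent with the IPD; the nearest one to itd_anechoic is returned.
    """
    base = ipd / (2.0 * math.pi * f)      # ITD within the pi-limit
    k = round((itd_anechoic - base) * f)  # whole periods to add
    return base + k / f


# Hypothetical values: left HRTF twice as large, leading by 0.5 rad at 1 kHz.
ild, ipd = ild_ipd(2.0 * cmath.exp(0.5j), 1.0 + 0.0j)
itd = itd_closest_to_anechoic(ipd, 1000.0, itd_anechoic=1.1e-3)
print(f"ILD = {ild:.2f} dB, IPD = {ipd:.3f} rad, ITD = {1e6 * itd:.1f} us")
```

By construction the returned ITD never differs from the anechoic ITD by more than half a period, which is what makes the estimate conservative.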
1. We thank Michael Akeroyd for his remark about the ellipsoidal locus of
reflection locations.
2. Similar phenomena are seen in crosstalk or active-noise cancellation (e.g.,
Akeroyd et al., 2007; Bai and Lee, 2006; Elliott and Nelson, 1993). In
loudspeaker reproduction, each ear receives the summed signals of both
speakers. This is analogous to a reflection problem, where the contralateral
speaker is seen as the mirror image of the ipsilateral speaker, with a
reflecting surface that is orthogonal to the inter-speaker axis.
3. Orthodromic is equivalent to a great circle distance in spherical trigonometry,
i.e., the shortest distance between two points on the surface of a
sphere.
Akeroyd, M. A., Chambers, J., Bullock, D., Palmer, A. R., Summerfield,
A. Q., Nelson, P. A., and Gatehouse, S. (2007). “The binaural performance
of a cross-talk cancellation system with matched or mismatched setup and
playback acoustics,” J. Acoust. Soc. Am. 121, 1056–1069.
Algazi, V. R., Avendano, C., and Duda, R. O. (2001). “Elevation localiza-
tion and head-related transfer function analysis at low frequencies,”
J. Acoust. Soc. Am. 109, 1110–1122.
Bai, M. R., and Lee, C.-C. (2006). “Objective and subjective analysis of
effects of listening angle on crosstalk cancellation in spatial sound
reproduction,” J. Acoust. Soc. Am. 120, 1976–1989.
Behrend, O., Dickson, B., Clarke, E., Jin, C., and Carlile, S. (2004). “Neural
responses to free field and virtual acoustic stimulation in the inferior colli-
culus of the guinea pig,” J. Neurophysiol. 92, 3014–3029.
Blauert, J. (1997). “Spatial hearing with multiple sound sources and in
enclosed spaces,” in Spatial Hearing: The Psychophysics of Human Sound
Localization (MIT Press, Cambridge, MA), pp. 201–286.
Brand, A., Behrend, O., Marquardt, T., McAlpine, D., and Grothe, B.
(2002). “Precise inhibition is essential for microsecond interaural time dif-
ference coding,” Nature 417, 543–547.
Bronkhorst, A. W., and Houtgast, T. (1999). “Auditory distance perception
in rooms,” Nature 397, 517–520.
Brown, C. H., and Gomez, R. (1992). “Functional design features in primate
vocal signals: The acoustic habitat and sound distortion,” in Topics of Pri-
matology, edited by T. Nishida, W. C. McGrew, and P. Marler (Tokyo
University Press, Tokyo), pp. 177–198.
Brown, C. H., and May, B. J. (2005). “Comparative mammalian sound
localization,” in Sound Source Localization, edited by A. N. Popper and
R. R. Fay (Springer, New York), pp. 124–178.
Brown, C. H., and Waser, P. M. (1988). “Environmental influences on the
structure of primate vocalizations,” in Primate Vocal Communication,
edited by D. Todt, P. Goedeking, and D. Symmes (Springer-Verlag, Ber-
lin), pp. 51–66.
Carr, C., and Konishi, M. (1990). “A circuit for detection of interaural
time differences in the brain stem of the barn owl,” J. Neurosci. 10,
3227–3246.
Chessell, C. I. (1977). “Propagation of noise along a finite impedance
boundary,” J. Acoust. Soc. Am. 62, 825.
Colburn, H. S. (1973). “Theory of binaural interaction based on auditory-
nerve data. I. General strategy and preliminary results on interaural dis-
crimination,” J. Acoust. Soc. Am. 54, 1458–1470.
Coleman, P. D. (1963). “An analysis of cues to auditory depth perception in
free space,” Psychol. Bull. 60, 302–315.
Cox, T., and D’Antonio, P. (2009). “Measurement of absorber properties,”
in Acoustic Absorbers and Diffusers: Theory, Design and Application
(Taylor & Francis, New York), pp. 70–107.
Cranford, J. L. (1982). “Localization of paired sound sources in cats: effects
of variable arrival times,” J. Acoust. Soc. Am. 72, 1309–1311.
Croghan, N. B. H., and Grantham, D. W. (2010). “Binaural interference in
the free field,” J. Acoust. Soc. Am. 127, 3085–3091.
Crow, G., Rupert, A. L., and Moushegian, G. (1978). “Phase locking in
monaural and binaural medullary neurons: implications for binaural phe-
nomena,” J. Acoust. Soc. Am. 64, 493–501.
Delany, M. E., and Bazley, E. N. (1970). “Acoustical properties of fibrous
absorbent materials,” Appl. Acoust. 3, 105–116.
Dent, M. L., and Dooling, R. J. (2004). “The precedence effect in three spe-
cies of birds (Melopsittacus undulatus, Serinus canaria, and Taeniopygia
guttata),” J. Comp. Psychol. 118, 325–331.
Dent, M. L., Tollin, D. J., and Yin, T. C. T. (2009). “Influence of sound
source location on the behavior and physiology of the precedence effect in
cats,” J. Neurophysiol. 102, 724–734.
Devore, S., and Delgutte, B. (2010). “Effects of reverberation on the direc-
tional sensitivity of auditory neurons across the tonotopic axis: Influences
of interaural time and level differences,” J. Neurosci. 30, 7826–7837.
Deza, E., and Deza, M. (2006). “Geometry and distances,” in Dictionary of
Distances (Elsevier, Amsterdam), pp. 62–133.
Dizon, R. M., and Colburn, H. S. (2006). “The influence of spectral, tempo-
ral, and interaural stimulus variations on the precedence effect,” J. Acoust.
Soc. Am. 119, 2947–2964.
Duda, R. O., and Martens, W. L. (1998). “Range dependence of the response
of a spherical head model,” J. Acoust. Soc. Am. 104, 3048–3058.
Elliott, S. J., and Nelson, P. A. (1993). “Active noise control,” IEEE Sign.
Process. Mag. 10, 12–35.
Embleton, T. F. W., Piercy, J. E., and Daigle, G. A. (1983). “Effective flow
resistivity of ground surfaces determined by acoustical measurements,”
J. Acoust. Soc. Am. 74, 1239–1244.
Freyman, R. L., Balakrishnan, U., and Helfer, K. S. (2001). “Spatial release
from informational masking in speech recognition,” J. Acoust. Soc. Am.
109, 2112–2122.
Giguère, C., and Abel, S. M. (1993). “Sound localization: effects of reverberation
time, speaker array, stimulus frequency, and stimulus
rise/decay,” J. Acoust. Soc. Am. 94, 769–776.
Grace, S. M., Quaranta, E., Shinn-Cunningham, B. G., and Voigt, H. F.
(2008). “Simulation of the binaural environmental transfer function for
gerbils using a boundary element method,” Acta Acust. Acust. 94,
310–320.
Griffin, S. J., Bernstein, L. R., Ingham, N. J., and McAlpine, D. (2005).
“Neural sensitivity to interaural envelope delays in the inferior colliculus
of the Guinea Pig,” J. Neurophysiol. 93, 3463–3478.
Grothe, B., Pecka, M., and McAlpine, D. (2010). “Mechanisms of sound
localization in mammals,” Physiol. Rev. 90, 983–1012.
Guski, R. (1990). “Auditory localization: effects of reflecting surfaces,” Per-
ception 19, 819–830.
Hancock, K. E., and Delgutte, B. (2004). “A physiologically based model
of interaural time difference discrimination,” J. Neurosci. 24,
7110–7117.
Hartmann, W. M. (1983). “Localization of sound in rooms,” J. Acoust. Soc.
Am. 74, 1380–1391.
Hartmann, W. M., and Rakerd, B. (1989). “Localization of sound in rooms.
IV. The Franssen effect,” J. Acoust. Soc. Am. 86, 1366–1373.
Hartung, K., and Trahiotis, C. (2001). “Peripheral auditory processing and
investigations of the ‘precedence effect’ which utilize successive transient
stimuli,” J. Acoust. Soc. Am. 110, 1505–1513.
ISO (1993). 9613–1:1993, Acoustics—Attenuation of Sound During Propa-
gation Outdoors—Part 1: Calculation of the Absorption of Sound by the
Atmosphere (International Organization for Standardization, Geneva).
Joris, P. X. (2003). “Interaural time sensitivity dominated by cochlea-
induced envelope patterns,” J. Neurosci. 23, 6345–6350.
Joris, P., and Yin, T. C. T. (2007). “A matter of time: Internal delays in bin-
aural processing,” Trends Neurosci. 30, 70–78.
Keller, C. H., and Takahashi, T. T. (1996a). “Responses to simulated echoes
by neurons in the barn owl’s auditory space map,” J. Comp. Physiol., A
178, 499–512.
Keller, C. H., and Takahashi, T. T. (1996b). “Binaural cross-correlation pre-
dicts the responses of neurons in the owl’s auditory space map under con-
ditions simulating summing localization,” J. Neurosci. 16, 4300–4309.
Keller, C. H., and Takahashi, T. T. (2005). “Localization and identification
of concurrent sounds in the owl’s auditory space map,” J. Neurosci. 25,
10446–10461.
Kelly, J. B. (1974). “Localization of paired sound sources in the rat: Small
time differences,” J. Acoust. Soc. Am. 55, 1277–1284.
Kirikae, I., Nakamura, K., Sato, T., and Shitara, T. (1971). “A study of bin-
aural interaction,” Ann. Bulletin No. 5, Research Institute of Logopedics-
Phoniatrics, University of Tokyo, pp. 115–126.
Köppl, C., and Carr, C. (2008). “Maps of interaural time difference in the
chicken’s brainstem nucleus laminaris,” Biol. Cybern. 98, 541–559.
Kuhn, G. F. (1977). “Model for the interaural time differences in the azi-
muthal plane,” J. Acoust. Soc. Am. 62, 157–167.
Kuwada, S., Stanford, T. R., and Batra, R. (1987). “Interaural phase-
sensitive units in the inferior colliculus of the unanesthetized rabbit:
Effects of changing frequency,” J. Neurophysiol. 57, 1338–1360.
Kuwada, S., and Yin, T. C. (1983). “Binaural interaction in low-frequency
neurons in inferior colliculus of the cat. I. Effects of long interaural delays,
intensity, and repetition rate on interaural delay function,” J. Neurophy-
siol. 50, 981–999.
Leakey, D. M. (1959). “Some measurements on the effects of interchannel
intensity and time differences in two channel sound systems,” J. Acoust.
Soc. Am. 31, 977–986.
Litovsky, R. Y., Colburn, H. S., Yost, W. A., and Guzman, S. J. (1999).
“The precedence effect,” J. Acoust. Soc. Am. 106, 1633–1654.
Lochner, J. P. A., and Burger, J. F. (1964). “The influence of reflections on
auditorium acoustics,” J. Sound Vib. 1, 426–454.
Maki, K., and Furukawa, S. (2005). “Acoustical cues for sound localization
by the Mongolian gerbil, Meriones unguiculatus,” J. Acoust. Soc. Am.
118, 872–886.
Maki, K., Furukawa, S., and Hirahara, T. (2003). “Acoustical cues for
sound localization by gerbils in an ecologically realistic environment,”
presented at the Association for Research in Otolaryngology, Vol. Abst.
26, p. 89.
Maki, K., Furukawa, S., and Hirahara, T. (2005). “The effects of sounds
reflected from the ground on the spatial response fields of neurons in the
gerbil inferior colliculus,” presented at the Association for Research in
Otolaryngology (New Orleans, LA), Vol. 28, pp. 38–39.
McAlpine, D. (2005). “Creating a sense of auditory space,” J. Physiol. (Lon-
don) 566, 21–28.
McAlpine, D., Jiang, D., and Palmer, A. R. (2001). “A neural code for low-
frequency sound localization in mammals,” Nat. Neurosci. 4, 396–401.
McFadden, D. (1973). “Letter: Precedence effects and auditory cells with
long characteristic delays,” J. Acoust. Soc. Am. 54, 528–530.
Mershon, D. H., and Bowers, J. N. (1979). “Absolute and relative cues for
the auditory perception of egocentric distance,” Perception 8, 311–322.
Mershon, D. H., and King, L. E. (1975). “Intensity and reverberation as fac-
tors in the auditory perception of egocentric distance,” Percept. Psycho-
phys. 18, 409–415.
Mickey, B. J., and Middlebrooks, J. C. (2001). “Responses of auditory corti-
cal neurons to pairs of sounds: Correlates of fusion and localization,”
J. Neurophysiol. 86, 1333–1350.
Miki, Y. (1990). “Acoustical properties of porous materials—Modifications
of Delany-Bazley models,” J. Acoust. Soc. Jpn. (E) 11, 19–24.
Musicant, A. D., and Butler, R. A. (1985). “Influence of monaural spectral
cues on binaural localization,” J. Acoust. Soc. Am. 77, 202–208.
Nelson, B. S., and Takahashi, T. T. (2010). “Spatial hearing in echoic envi-
ronments: The role of the envelope in owls,” Neuron 67, 643–655.
Nielsen, S. H. (1993). “Auditory distance perception in different rooms,”
J. Audio Eng. Soc. 41, 755–770.
Ono, N., Fukamachi, S., and Sagayama, S. (2008). “Sound source localiza-
tion with front-back judgement by two microphones asymmetrically
mounted on a sphere,” J. Multimedia 3, 1–9.
Populin, L. C., and Yin, T. C. (1998). “Behavioral studies of sound localiza-
tion in the cat,” J. Neurosci. 18, 2147–2160.
Rabinowitz, W. M., Maxwell, J., Shao, Y., and Wei, M. (1993). “Sound local-
ization cues for a magnified head—Implications from sound diffraction
about a rigid sphere,” Presence Teleoperators Virtual Environ. 2, 125–129.
Rakerd, B., and Hartmann, W. M. (1985). “Localization of sound in rooms. II.
The effects of a single reflecting surface,” J. Acoust. Soc. Am. 78, 524–533.
Rakerd, B., and Hartmann, W. M. (2005). “Localization of noise in a reverberant
environment,” in Auditory Signal Processing: Physiology,
Psychoacoustics, and Models, edited by D. Pressnitzer, A. Cheveigné, S.
McAdams, and L. Collet (Springer-Verlag, New York), pp. 348–354.
Rasmussen, K. B. (1981). “Sound propagation over grass covered ground,”
J. Sound Vib. 78, 247–255.
Rayleigh, Lord, and Lodge, A. (1904). “On the acoustic shadow of a sphere
with an appendix, giving the values of Legendre’s functions from P0 to P20
at intervals of 5 degrees,” Philos. Trans. R. Soc. London, Ser. A 203, 87–110.
Sheeline, C. (1982). An Investigation of the Effects of Direct and Reverberant
Signal Interactions on Auditory Distance Perception, Ph.D. thesis,
Stanford University, pp. 63–70.
Shinn-Cunningham, B., Desloge, J. G., and Kopco, N. (2001). “Empirical
and modeled acoustic transfer functions in a simple room: Effects of dis-
tance and direction,” in IEEE Workshop On Applications Of Signal Proc-
essing To Audio And Acoustics 2001, pp. 183–186.
Shinn-Cunningham, B. G., Kopco, N., and Martin, T. J. (2005). “Localizing
nearby sound sources in a classroom: Binaural room impulse responses,”
J. Acoust. Soc. Am. 117, 3100–3115.
Shinn-Cunningham, B. G., Zurek, P. M., and Durlach, N. I. (1993).
“Adjustment and discrimination measurements of the precedence effect,”
J. Acoust. Soc. Am. 93, 2923–2932.
Spitzer, M. W., and Takahashi, T. T. (2006). “Sound localization by barn
owls in a simulated echoic environment,” J. Neurophysiol. 95, 3571–3584.
Sterbing, S. J., Hartung, K., and Hoffmann, K.-P. (2003). “Spatial tuning to
virtual sounds in the inferior colliculus of the Guinea Pig,” J. Neurophy-
siol. 90, 2648–2659.
Sutherland, L. C., and Daigle, G. A. (1998). “Atmospheric sound prop-
agation,” in Handbook Of Acoustics, edited by M. J. Crocker (Wiley-
IEEE, New York), pp. 305–330.
Thornton, J. L., Koka, K., Jones, H. G., and Tollin, D. J. (2009). “A
frequency-dependence in both the acoustic interaural time difference cues
to sound location and their encoding by neurons in the inferior colliculus,”
presented at the ARO, Abstract 529.
Thurlow, W. R., Mangels, J. W., and Runge, P. S. (1967). “Head movements
during sound localization,” J. Acoust. Soc. Am. 42, 489–493.
Tollin, D. J., and Yin, T. C. (2003). “Spectral cues explain illusory elevation
effects with stereo sounds in cats,” J. Neurophysiol. 90, 525–530.
Trahiotis, C., and Hartung, K. (2002). “Peripheral auditory processing, the
precedence effect and responses of single units in the inferior colliculus,”
Hear. Res. 168, 55–59.
Truax, B. (1999). Handbook for Acoustic Ecology, 2nd ed, Compact Disc
(Cambridge Street Publishing, Cambridge).
Wagner, H., Asadollahi, A., Bremen, P., Endler, F., Vonderschen, K., and
von Campenhausen, M. (2007). “Distribution of interaural time difference
in the barn owl’s inferior colliculus in the low- and high-frequency
ranges,” J. Neurosci. 27, 4191–4200.
Wallach, H., Newman, E. B., and Rosenzweig, M. R. (1949). “The prece-
dence effect in sound localization,” Am. J. Psychol. 62, 315–336.
Waser, P. M., and Brown, C. H. (1986). “Habitat acoustics and primate
communication,” Am. J. Primatol. 10, 135–154.
Weideman, J. A. C. (1994). “Computation of the Complex Error Function,”
SIAM J. Numer. Anal. 31, 1497–1518.
Wendt, K. (1963). “Das Richtungshören bei der Überlagerung zweier
Schallfelder bei Intensitäts- und Laufzeitstereophonie” (“Directional hearing
with two superimposed sound fields in intensity- and delay-difference
stereophony”), Ph.D. thesis, Rheinisch-Westfälische Technische Hochschule,
Aachen, pp. 1–107.
Wightman, F. L., and Kistler, D. J. (1992). “The dominant role of low-
frequency interaural time differences in sound localization,” J. Acoust.
Soc. Am. 91, 1648–1661.
Wiley, R. H., and Richards, D. G. (1978). “Physical constraints on acoustic
communication in the atmosphere: Implications for the evolution of ani-
mal vocalizations,” Behav. Ecol. Sociobiol. 3, 69–94.
Wyttenbach, R. A., and Hoy, R. R. (1993). “Demonstration of the prece-
dence effect in an insect,” J. Acoust. Soc. Am. 94, 777–784.
Xiao, X., and Grantham, D. W. (1997). “The effect of a free-field auditory
target’s motion on its detectability in the horizontal plane,” J. Acoust. Soc.
Am. 102, 1907–1910.
Lu, Y.-C., and Cooke, M. (2010). “Binaural estimation of sound
source distance via the direct-to-reverberant energy ratio for static and
moving sources,” IEEE Trans. Audio Speech Lang. Process. 18,
1793–1805.
Yin, T. C. (1994). “Physiological correlates of the precedence effect and
summing localization in the inferior colliculus of the cat,” J. Neurosci. 14,
5170–5186.
Yin, T. C., and Chan, J. C. (1990). “Interaural time sensitivity in medial
superior olive of cat,” J. Neurophysiol. 64, 465–488.
Zahorik, P. (2002). “Assessing auditory distance perception using virtual
acoustics,” J. Acoust. Soc. Am. 111, 1832.
Zahorik, P., Brungart, D. S., and Bronkhorst, A. W. (2005). “Auditory dis-
tance perception in humans: A summary of past and present research,”
Acta Acust. Acust. 91, 409–420.
Zurek, P. M. (1987). “The precedence effect,” in Directional Hearing,
edited by W. A. Yost and G. Gourevitch (Springer-Verlag, New York),
pp. 85–106.
J. Acoust. Soc. Am., Vol. 132, No. 1, July 2012. B. Gourévitch and R. Brette: Impact of early reflections
... Thus, the perception is consistent with the movement of a physical sound source. A possible explanation of this effect can be found in [19], which studies the impact of early reflections on binaural cues by means of numerical models. The authors show that an early reflection produces a discontinuity in the interaural phase difference at the interference frequency f, which can be arbitrarily large and yields an ITD that can take any value between ±1/(2f). Based on this insight, the authors in [19] conclude that floor reflections with short delays provide reliable cues for elevation estimation by changing ITDs. It is possible that these cues are learned in the same way as pinna cues, and that small head rotations are sufficient to give a directional sense. ...
... For such a condition the effect size was found to be comparable to vertical VBAP. Agreeing with [19] we believe that median plane reflections produce localization cues which are encoded by the auditory system. ...
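The ±1/(2f) bound quoted above can be illustrated with a minimal sketch. It assumes the simplest case described in the figure caption: an acoustically transparent head, a pure tone, and a single reflection with a frequency-independent gain. The function names, the gain value, and the example delays are illustrative assumptions, not values taken from the cited paper.

```python
import cmath
import math


def wrap_phase(phi):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def monaural_phase_shift(freq_hz, delay_s, gain):
    """Phase of (direct + reflected) relative to the direct sound alone.

    The direct sound is a unit vector; the reflection is a vector of
    length `gain` rotated by -2*pi*f*delay (assumed model).
    """
    total = 1.0 + gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return cmath.phase(total)


def itd_change(freq_hz, delay_left_s, delay_right_s, gain=1.0):
    """Change in ITD (seconds) caused by the reflection at one frequency.

    The IPD change is the wrapped difference of the two monaural phase
    shifts; dividing by 2*pi*f converts it to a time difference, which
    therefore cannot exceed 1/(2f) in magnitude.
    """
    ipd = wrap_phase(
        monaural_phase_shift(freq_hz, delay_left_s, gain)
        - monaural_phase_shift(freq_hz, delay_right_s, gain)
    )
    return ipd / (2.0 * math.pi * freq_hz)


if __name__ == "__main__":
    # Hypothetical reflection delays at the left and right ears.
    d_left, d_right = 1.0e-3, 1.05e-3
    for f in (400.0, 470.0, 490.0, 510.0, 600.0):
        change_us = itd_change(f, d_left, d_right) * 1e6
        bound_us = 1e6 / (2.0 * f)
        print(f"{f:6.0f} Hz  ITD change {change_us:9.1f} us  bound {bound_us:7.1f} us")
```

With these assumed delays, the ITD change stays small at most frequencies and becomes large only in the narrow band where the reflected phase has passed π at one ear but not yet at the other.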
Conference Paper
This contribution considers how delay alterations of a median plane reflection influence the vertical auditory movement. Depending on the excitation signal, we could show two partly contrasting effects dominating our perception: For broadband noise signals, a delay modification between the leading direct sound and the lagging reflection yields spectral cues, and the auditory movement is explained by the pitch-height effect. In contrast, results obtained for speech signals indicate that other localization cues induced by the interference pattern of direct sound and reflection influence our perception. The respective movement directions are consistent with the movement of a physical sound source, suggesting that the auditory system learns to associate interference patterns with sound source movements.
... The underestimation of ITD by the sphere model can be explained by these geometrical and acoustical differences between a sphere and an actual head. In addition, Gourévitch and Brette [17] indicated that ITD varies when there are reflections from nearby walls. This phenomenon may be similar to the one considered in the current work, because both can be regarded as effects of reflections, or backscatter diffraction, on ITDs. Although the walls assumed in [17] are separate from the head and more distant, whereas the current work discusses reflections from the head itself, it would be intriguing to see in future work whether a similar physical mechanism is at play. ...
Article
The interaural time difference (ITD) plays an important role in spatial hearing, particularly in azimuthal localization of sound images. Although the ITD is essentially determined by the geodesic distance between two ears, researchers have reported that the ITD is greater for lower frequencies. However, the origin of this frequency-dependence has not been revealed. This study investigates how the ITD is physically characterized to have a frequency-dependent nature by conducting measurements and numerical simulations. Dummy head measurements show that the ITD varies with frequency because the apparent propagation time to the ipsilateral ear decreases for low frequency. Dummy head simulations confirmed this phenomenon and revealed that the apparent propagation time decreases because of a sound pressure phase shift due to reflections from the head. Circular plate simulations revealed that the circular profile including its lateral surface and edge produces reflections that are relevant to the phase shift, yielding the frequency-dependence of the apparent propagation time. Furthermore, rigid sphere simulations showed that such reflections are produced even by smooth convex surfaces without clear-cut edges. These results strongly suggest that a major factor in the production of the frequency-dependence of ITDs is backscatter diffractions from convex surfaces of the head and the pinna.
... The RIR is windowed with a sliding rectangular window of 1 ms, and the TOA of a reflection is defined as the time index where the local energy is 3 times higher than the median energy in the window [15]. The window size ensures a high temporal resolution in order to capture the perceptually important floor reflection [16,17]. For each detected reflection, the RMS amplitude $a_n^{\mathrm{ref}}$ in an asymmetrical window of 1.5 ms (similar to that described above for the direct sound), as well as the peak amplitude, is calculated. ...
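The detection rule described above (a sliding 1 ms window and a threshold of 3 times the median energy) could be sketched as follows. This is only one reading of the quoted sentence: centring the window on the sample under test, using squared sample values as "local energy", and reporting every flagged sample without merging neighbours are all assumptions, and the function name is hypothetical.

```python
from statistics import median


def detect_reflection_toas(rir, fs_hz, window_s=1e-3, ratio=3.0):
    """Return sample indices whose energy exceeds `ratio` times the
    median energy of a rectangular window centred on that sample.

    rir      : sequence of impulse-response samples
    fs_hz    : sampling rate in Hz
    window_s : window length in seconds (1 ms in the quoted text)
    ratio    : energy threshold relative to the window median
    """
    half = max(1, int(round(window_s * fs_hz)) // 2)
    energy = [x * x for x in rir]
    toas = []
    for n in range(len(energy)):
        lo = max(0, n - half)
        hi = min(len(energy), n + half + 1)
        # Median energy inside the (truncated at the edges) window.
        med = median(energy[lo:hi])
        if energy[n] > ratio * med:
            toas.append(n)
    return toas


if __name__ == "__main__":
    # Synthetic impulse response: silence with two isolated arrivals.
    fs = 48000
    rir = [0.0] * 480
    rir[100] = 1.0   # assumed direct sound
    rir[300] = -0.5  # assumed early reflection
    print(detect_reflection_toas(rir, fs))
```

In a measured response, a reflection usually spans several consecutive samples, so a practical version would probably group adjacent detections into a single arrival; that step is omitted here because the quoted text does not specify it.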
Article
Parametric spatial audio rendering is a popular approach for low computing capacity applications, such as augmented reality systems. However, most methods rely on spatial room impulse responses (SRIR) for sound field rendering with 3 degrees of freedom (DoF), i.e., for arbitrary head orientations of the listener, and often require multiple SRIRs for 6-DoF rendering, i.e., when additionally considering listener translations. This paper presents a method for parametric spatial audio rendering with 6 DoF based on one monaural room impulse response (RIR). The scalable and perceptually motivated encoding results in a parametric description of the spatial sound field for any listener’s head orientation or position in space. These parameters form the basis for the binaural room impulse response (BRIR) synthesis algorithm presented in this paper. The physical evaluation revealed good performance, with differences from reference measurements at most tested positions in a room below the just-noticeable differences of various acoustic parameters. The paper further describes the implementation of a 6-DoF real-time virtual acoustic environment (VAE) using the synthesized BRIRs. A pilot study assessing the plausibility of the 6-DoF VAE showed that the system can provide a plausible binaural reproduction, but it also revealed challenges of 6-DoF rendering requiring further research.
... Previous studies have examined the impact of various degrees of acoustic conditions on binaural cues, such as ground reflections for small animals [39], or subjective perception of an auditory scene [40]. There are surprisingly few studies which have examined the impact of room acoustics on localization in particular, with tested conditions often being quite limited, and results varying between studies. ...
Article
This study examines the efficiency of a training protocol using a virtual reality application designed to accelerate individual’s selection of, and accommodation to, non-individualized HRTF profiles. This training introduces three elements to hasten audio localization performance improvement: an interactive HRTF selection method, a parametric training program based on active learning, and a relatively dry room acoustic simulation designed to increase the quantity of spatial cues presented. Participants rapidly selected an HRTF (≈5 min) followed by training over three sessions of 12 min distributed over 5 days. To study the impact of the room acoustic component on localization performance evolution, participants were divided into two groups: one acting as control reference, training with only anechoic renderings, the other training in reverberant conditions. The efficiency of the training program was assessed across groups and the entire protocol was assessed through direct comparisons with results reported in previous studies. Results indicate that the proposed training program led to improved learning rates compared to that of previous studies, and that the included room response accelerated the learning process.
... This is due to the complex sound field within the small enclosed space, including the acoustic modes at low frequency and fast attenuation at high frequency caused by boundary absorption [11]. The resulting special physical sound field and complex acoustic environment are substantially different from the large acoustic space, and more different from the free field, which makes the law and mechanism of auditory distance perception in small enclosed spaces substantially different from that in the existing works in free field [12] and large acoustic space [8]. Therefore, the auditory distance perception in small enclosed spaces needs further investigation. ...
Conference Paper
The direct-to-reverberation ratio (D/R) is regarded as a cue for absolute auditory distance perception in reflective environments. This is based on the assumption that the late reverberation from the environment can be approximated as a diffuse field. This assumption is generally appropriate for enclosed spaces of sufficient size. However, for small enclosed spaces such as the inside of a car, the reflections no longer form a diffuse reverberation field. This is due to the complex sound field within the small enclosed space, including the acoustic modes at low frequencies and fast attenuation at high frequencies caused by boundary absorption. This work reexamines the contribution of reflections to auditory distance perception in small enclosed spaces. The sound field and binaural room impulse responses (BRIRs) in a car were simulated by combining numerical simulation with the Finite Element Method (FEM) and the Ray Tracking Method (RTM). Auditory distance perception was then evaluated by auralization, that is, by convolving the mono stimuli with the simulated BRIRs and rendering them through earphones. The results indicate that reflections in small enclosed spaces contribute to auditory distance perception to some extent, but the effect is limited compared with that reported in previous work on large enclosed spaces.
... Early reflections, those following the direct signal by 50 ms or less (Bradley et al., 2003), can disrupt low-frequency sound localisation in realistic auditory environments by conveying spurious ITD cues in the TFS of sounds (Gourévitch & Brette, 2012). Despite this, normal-hearing listeners are highly adept at locating the true source of a sound, including speech, in reverberant environments. ...
Preprint
Listeners perceive sound energy as originating from the direction of its source, even when followed only milliseconds later by reflected energy from multiple different directions. Here, modelling responses of brainstem neurons responsible for encoding auditory spatial cues, we demonstrate that accurate localisation in reverberant environments relies on pre- and post-synaptic specializations that emphasise spatial information in early-arriving sound energy. Synaptic depression in spherical bushy cells (SBCs) of the cochlear nucleus, and outwardly-rectifying membrane currents in SBCs and neurons of the medial superior olive (MSO) to which SBCs project bilaterally, are necessary to account for human listening performance. These biophysical properties appear suited to efficient coding of spatial information, emphasising early-arriving spatial information, particularly at sound frequencies where reverberant energy is relatively intense. Applied to lateralisation of human speech in a virtual reverberant room, we show that frequency-dependent membrane properties enhance correct, over spurious, localisation cues at the earliest stages of spatial processing.
... In other words, the slipped cycle ITD is a strong localization cue in our experiments as revealed by frequency-averaged correlation functions (Timing 0 and Level − in Fig. 4) and the localization results in Level − (Fig. 2). Note that the problem of a "wrong" side ITD is not limited to the stereophonic conditions we tested and could include situations with multiple sources and room reflections from nearby surfaces (Gourévitch and Brette, 2012). Our findings extend the pure-tone observations of Hartmann and colleagues and further suggest that consistent ILDs help strengthen the ITD information from the primary source. ...
Article
Auditory spatial perception relies on more than one spatial cue. This study investigated the effects of cue congruence on auditory localization and the extent of visual bias between two binaural cues, interaural time differences (ITDs) and interaural level differences (ILDs). Interactions between these binaural cues were manipulated by stereophonic techniques. The results show that incoherent binaural information increased auditory response noise and amplified visual bias. The analysis further suggests that although ILD is not the dominant cue for low-frequency localization, it may strengthen the position estimate by combining with the dominant ITD information to minimize estimation noise.
... Gourévitch et al. [55] proposed that reflections can contribute to binaural distance cues when the reflective properties of the surface are known and stable, for example, the floor. ...
Thesis
The aim of this work is to investigate possible applications of sonic crystals in controlling acoustic space. A sonic crystal is the simplest way of obtaining an acoustic metamaterial and is built from a crystalline array of cylinders in air. Such crystals have nontrivial transmission and reflection properties; examples include the focusing of acoustic energy, band gaps, and birefringence. The first part of the thesis investigates the effect of using these materials as the "walls" of a reproduction room, exploiting their reflection properties. Computational models were built using a method developed in this work that combines multiple scattering with geometrical-acoustics techniques (ray tracing) to obtain the energy decay response as a function of frequency. The model results were then verified experimentally. They indicate that using sonic crystals as walls makes it possible to obtain a room with a nontrivial acoustic response, and that this response can be "equalized" by modifying the parameters of the crystal lattice, without modifying the geometry of the room. One application of the study of resonant cavities is the design and construction of sonic crystal modules for room acoustics. These devices are intended to provide a new acoustic tool for the performance of contemporary music works. Next, it was proposed to exploit the transmission properties of sonic crystals as a means of virtually displacing a sound source by purely acoustic means. Here, an experiment was carried out to determine how placing a section of sonic crystal between a listener and a sound source affects auditory distance perception.
The results indicate that, in the frequency range where focusing occurs, the sonic crystal creates the illusion that the source is closer. Although this effect could be attributed to the increase in sound intensity caused by focusing, complementary experiments with equalized intensity show that the effect persists even when there are no intensity differences. In particular, interaural correlation was found to play a role in the apparent approach of the source for this type of configuration.
Thesis
Binaural rendering aims to immerse the listener in a virtual acoustic scene, making it an essential method for spatial audio reproduction in virtual or augmented reality (VR/AR) applications. The growing interest and research in VR/AR solutions yielded many different methods for the binaural rendering of virtual acoustic realities, yet all of them share the fundamental idea that the auditory experience of any sound field can be reproduced by reconstructing its sound pressure at the listener's eardrums. This thesis addresses various state-of-the-art methods for 3 or 6 degrees of freedom (DoF) binaural rendering, technical approaches applied in the context of headphone-based virtual acoustic realities, and recent technical and psychoacoustic research questions in the field of binaural technology. The publications collected in this dissertation focus on technical or perceptual concepts and methods for efficient binaural rendering, which has become increasingly important in research and development due to the rising popularity of mobile consumer VR/AR devices and applications. The thesis is organized into five research topics: Head-Related Transfer Function Processing and Interpolation, Parametric Spatial Audio, Auditory Distance Perception of Nearby Sound Sources, Binaural Rendering of Spherical Microphone Array Data, and Voice Directivity. The results of the studies included in this dissertation extend the current state of research in the respective research topic, answer specific psychoacoustic research questions and thereby yield a better understanding of basic spatial hearing processes, and provide concepts, methods, and design parameters for the future implementation of technically and perceptually efficient binaural rendering.
Article
This study investigated the effects of frequency of the sound source, source-to-subject distance, orientation angle of the sound source, and cue of source on the error in judging the angle of orientation. Three levels of frequency of sound: 0.125, 1.0, and 6.0 kHz; three levels of source-to-subject distance: 1-, 2-, and 3-m; eight levels of angled orientation of sound: from 0° (in the front of subject) to 315°, in which every level is 45°, 180° represent behind the subject; two cues of source: two-ear (i.e., binaural hearing) and single-ear (i.e., better-hearing ear) were tested. A 3 (frequency of sound source) × 8 (angled orientation of the sound source) × 3 (source-to-subject distance) repeated-measures design was used. Results indicated that all four independent variables had significant effects on the error in judging the angle of orientation. The error in judging the angle of orientation for 6.0 kHz was greater than those for 0.125 kHz and 1.0 kHz. For the source-to-subject distance, 3-m resulted in the greater error in judging the angle of orientation. For the angled orientation of sound source, 315°, 90°, or 270° resulted in best localization performance; however, 0° and 180° resulted in the worst localization performance. The error in judging the angle of orientation for binaural hearing was smaller than for the better-hearing ear. The interaction effects of frequency of the sound × source-to-subject distance and frequency of the sound × angled orientation of sound source were statistically significant. In conclusion, the error in judging the angle of orientation was best with binaural hearing, 0.125 kHz, 45°, and 1-m.
Article
1. We studied the sensitivity of cells in the medial superior olive (MSO) of the anesthetized cat to variations in interaural phase differences (IPDs) of low-frequency tones and in interaural time differences (ITDs) of tones and broad-band noise signals. Our sample consisted of 39 cells histologically localized to the MSO. 2. All but one of the cells had characteristic frequencies less than 3 kHz, and 79% were sensitive to ITDs and IPDs. More than one-half (56%) of the cells responded to monaural stimulation of either ear, and both the binaural and monaural responses were highly phase locked. All of the cells that were sensitive to IPDs and monaurally driven by either ear responded in accord with that predicted by the coincidence model of Jeffress, as judged by comparisons of the phases at which the monaural and binaural responses occurred. The optimal IPDs were tightly clustered between 0.0 and 0.2 cycles. Most cells exhibited facilitation of the response at favorable ITDs and inhibition at unfavorable ITDs compared with the monaural responses. 3. Cells in the MSO exhibited characteristic delay, as judged by a linear relationship between the mean interaural phase and stimulating frequency. Characteristic phases were clustered near 0, indicating that most cells responded maximally when the two input tones were in phase. With the use of the binaural beat stimulus we found no differential selectivity for either the direction or speed of interaural phase changes. 4. The cells were also sensitive to ITDs of broad-band noise signals. The ITD curve in response to broad-band noise was similar to that predicted by the composite curve, which was calculated by linearly summating the tonal responses over the frequencies in the response area of the cell. Most (93%) of the peaks of the composite curves were between 0 and +400 microseconds, corresponding to locations in the contralateral sound field.
Moreover, computer cross correlations of the monaural spike trains were similar to the ITD curve generated binaurally for both correlated and uncorrelated noise signals to the two ears. Thus our data suggest that the cells in the MSO behave much like cross-correlators. 5. By combining data from different animals and locating each cell on a standard MSO, we found evidence for a spatial map of ITDs across th