Audio Engineering Society
Convention Paper
Presented at the 125th Convention
2008 October 2–5 San Francisco, CA, USA
The papers at this Convention have been selected on the basis of a submitted abstract and extended precis that have
been peer reviewed by at least two qualified anonymous reviewers. This convention paper has been reproduced from
the author’s advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes
no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio
Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights
reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the
Journal of the Audio Engineering Society.
Reproduction of Virtual Sound Sources Moving at Supersonic Speeds in Wave Field Synthesis
Jens Ahrens and Sascha Spors
Deutsche Telekom Laboratories, Technische Universität Berlin, Ernst-Reuter-Platz 7, 10587 Berlin, Germany
Correspondence should be addressed to Jens Ahrens (jens.ahrens@telekom.de)
ABSTRACT
In conventional implementations of wave field synthesis, moving sources are reproduced as sequences of
stationary positions. As reported in the literature, this process introduces various artifacts. It has been
shown recently that these artifacts can be reduced when the physical properties of the wave field of moving
virtual sources are explicitly considered. However, the findings were only applied to virtual sources moving
at subsonic speeds. In this paper we extend the published approach to the reproduction of virtual sound
sources moving at supersonic speeds. The properties of the actually reproduced sound field are investigated
via numerical simulations.
1. INTRODUCTION
For several decades, the problem of physically recreating a given wave field has been addressed in the audio community. Independent of the chosen approach, two rendering techniques exist: data-based and model-based reproduction [1]. The former aims at perfectly reproducing a captured sound field and will not be treated in this paper. We concentrate on the latter case, where a sound scene is composed of a number of virtual sound sources derived from analytical spatial source models. For stationary virtual scenes, accurate reproduction techniques exist. However, the reproduction of dynamic scenes entails certain peculiarities. This is mostly due to the fact that the speed of sound in air is constant. When a source moves, the propagation speed of the emitted wave field is not affected. However, the emitted wave field differs from that of a static source in various ways. For example, for sources moving slower than the speed of sound, the sound waves emitted in the direction of motion experience an increase in frequency, while sound waves emitted in the direction opposite to the motion experience a decrease in frequency. These alterations are collectively known as the Doppler Effect [2].
Typical implementations of sound field reproduction systems do not take the Doppler Effect into account. Dynamic virtual sound scenes are rather reproduced as a sequence of stationary snapshots. Thus, not only the virtual source but also its entire wave field is moved from one time instant to the next. This concatenation leads to Doppler-like frequency shifts. However, these frequency shifts occur due to warping of the time axis rather than due to the constant speed of sound, a circumstance which introduces artifacts. Furthermore, this approach is limited to the reproduction of virtual sources moving slower than the speed of sound. The artifacts have recently been discussed in the literature in the context of wave field synthesis [3]. We are not aware of a corresponding publication focusing on alternative sound field reproduction methods. See [4, 5] for a treatment of moving virtual sources in binaural (HRTF-based) reproduction.
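To make the snapshot-based rendering described above concrete, the following is a minimal sketch (not the implementation of any particular WFS system; all parameter values are illustrative assumptions) of a time-varying delay line: at each output sample the source position is treated as stationary, the propagation delay to the receiver is updated, and the source signal is read at the warped time instant. The Doppler-like shift arises purely from this warping of the read position.

```python
# Minimal sketch: conventional "sequence of stationary snapshots" rendering.
# The frequency shift results from warping the read position of a delay line,
# not from a physical model of the moving source.
import numpy as np

fs = 44100                 # sampling rate in Hz
c = 343.0                  # speed of sound in m/s
v = 40.0                   # source speed in m/s (subsonic)
f_src = 500.0              # emitted frequency in Hz
duration = 2.0
t = np.arange(int(duration * fs)) / fs

src = np.sin(2 * np.pi * f_src * t)          # source signal s0(t)
x_src = -40.0 + v * t                        # source x-position per snapshot
receiver = np.array([0.0, 4.0])              # receiver position in m

# distance and per-sample delay for each snapshot
dist = np.hypot(x_src - receiver[0], receiver[1])
delay = dist / c                             # seconds

# read the source signal at the warped time t - delay (linear interpolation)
read_pos = (t - delay) * fs
idx = np.clip(read_pos.astype(int), 0, len(src) - 2)
frac = read_pos - idx
out = (1 - frac) * src[idx] + frac * src[idx + 1]
out /= np.maximum(dist, 0.5)                 # simple 1/r amplitude weighting
```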
Various alternative implementations of the conventional approach of concatenating stationary source positions as outlined above are in use, operating both frame-based and in a sample-by-sample fashion. Most notably, in [3] it is proposed to incorporate the retarded time of a moving source (see section 2) into the driving function of a stationary source. Results presented there show that this strategy still leaves prominent artifacts.
As shown by the authors in [6], the mentioned artifacts occurring in conventional implementations can be avoided when the physical properties of the wave field of moving sound sources are taken into account a priori. However, the approach in [6] was exclusively applied to virtual sources moving slower than the speed of sound. In this paper, we extend this approach to the reproduction of virtual sources moving at supersonic speeds. Our work can also be regarded as an extension of the approach presented in [7], which focuses on the reproduction of the frequency content present in supersonic booms of aircraft but does not physically reproduce the actual wave front.
Note that the considerations presented in this paper are of relevance only for sound field reproduction approaches which employ time delays in the derivation of the loudspeaker driving signals.
Fig. 1: The coordinate system and geometry used in this paper. The dots denote the positions of the secondary sources used for wave field synthesis. The grey-shaded area denotes the listening area.
2. THE WAVE FIELD OF A MOVING SOURCE
The fundamental prerequisite for model-based sound
field reproduction is the knowledge of the sound field
that is to be recreated. In this section, we derive
analytical expressions of the sound field of a moving
sound source. For simplicity, we assume a monopole
source. However, the presented approach also allows
for the treatment of arbitrary source types. The
derivation below follows [8, 9].
The time-domain free-field Green's function of a stationary sound source at position $\mathbf{x}_s$, i.e. its spatio-temporal impulse response, is denoted by $g(\mathbf{x}-\mathbf{x}_s, t)$. See figure 1 for a sketch of the coordinate system. The time-domain Green's function of a moving sound source is then $g\big(\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t)),\, t-\tilde{t}(\mathbf{x},t)\big)$, whereby $\tilde{t}(\mathbf{x},t)$ denotes the time instant when the impulse was emitted. Confer to figure 2. $g\big(\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t)),\, t-\tilde{t}(\mathbf{x},t)\big)$ is referred to as the retarded Green's function [8]. $\tilde{t}(\mathbf{x},t)$ depends on the location of the receiver $\mathbf{x}$ and the time $t$ that the receiver experiences.

Fig. 2: Derivation of the Green's function of a moving sound source.

Assume a monochromatic harmonic source oscillating at angular frequency $\omega_s$. Its source function $s_0(\tilde{t})$ reads in complex notation

$$ s_0(\tilde{t}) = a_0 \cdot e^{j\omega_s \tilde{t}} . \tag{1} $$

In order to yield the wave field produced by a moving source with spatio-temporal impulse response $g\big(\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t)),\, t-\tilde{t}(\mathbf{x},t)\big)$ driven by the signal $s_0(\tilde{t})$, we model $s_0(\tilde{t})$ as a dense sequence of weighted Dirac pulses. Each Dirac pulse of the sequence multiplied by $g\big(\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t)),\, t-\tilde{t}(\mathbf{x},t)\big)$ yields the wave field created by the respective Dirac pulse. To yield the wave field emitted due to the entire sequence of Dirac pulses, we integrate over $\tilde{t}$ as

$$ s(\mathbf{x},t) = \int_{-\infty}^{\infty} s_0(\tilde{t}) \cdot g\big(\mathbf{x}-\mathbf{x}_s(\tilde{t}),\, t-\tilde{t}\big)\, \mathrm{d}\tilde{t} , \tag{2} $$

whereby we temporarily altered the nomenclature for convenience ($\tilde{t} = \tilde{t}(\mathbf{x},t)$).
Assuming a moving monopole sound source, its Green's function explicitly reads

$$ g\big(\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t)),\, t-\tilde{t}(\mathbf{x},t)\big) = \frac{1}{4\pi}\, \frac{\delta\!\left(t-\tilde{t}(\mathbf{x},t)-\frac{|\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t))|}{c}\right)}{|\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t))|} . \tag{3} $$

Note that

$$ \tau(\mathbf{x},t) = \frac{|\mathbf{x}-\mathbf{x}_s(\tilde{t}(\mathbf{x},t))|}{c} \tag{4} $$

is referred to as the retarded time [8]. It denotes the duration of sound propagation from the source to the receiver. In the remainder of this paper, $M = v/c$ denotes the Mach number, with $v$ being the speed of the sound source.

For convenience, we assume the virtual source to move uniformly along the x-axis in positive x-direction (cf. figure 2). As outlined in [6], arbitrary trajectories can be approximated by assuming a piece-wise uniform motion and an appropriate translation and rotation of the coordinate system. At time $t=0$ the source is located at position $\mathbf{x}_s(0)$. For this particular source trajectory, the integral in equation (2) can be solved via the substitution

$$ u = \tilde{t}(\mathbf{x},t) + \tau(\mathbf{x},t) \tag{5} $$

and the exploitation of the sifting property of the delta function [10]. It turns out that the integral has different solutions for $M<1$, $M=1$, and $M>1$.
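For illustration, the emission time $\tilde{t}(\mathbf{x},t)$ defined implicitly by (4) and (5) can also be found numerically. The following is a minimal sketch assuming uniform subsonic motion (where the root is unique); the trajectory, function names, and bisection bounds are illustrative assumptions, not part of the paper's derivation.

```python
# Minimal sketch: solving t = t_tilde + |x - x_s(t_tilde)| / c for the emission
# time t_tilde by bisection. For subsonic motion the root is unique.
import numpy as np

c = 343.0

def source_position(t_tilde, v=120.0, xs0=-20.0):
    """Uniform motion along the x-axis: x_s(t) = (xs0 + v*t, 0)."""
    return np.array([xs0 + v * t_tilde, 0.0])

def emission_time(x, t, v=120.0, lo=-100.0, hi=None):
    """Bisection on f(t_tilde) = t_tilde + |x - x_s(t_tilde)|/c - t."""
    if hi is None:
        hi = t                     # sound cannot arrive before it is emitted
    f = lambda tt: tt + np.linalg.norm(x - source_position(tt, v)) / c - t
    for _ in range(80):            # 80 halvings give sub-nanosecond accuracy
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_receiver = np.array([0.0, 2.0])
t_tilde = emission_time(x_receiver, t=0.1)
tau = 0.1 - t_tilde                # retarded time, eq. (4)
```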
In the following sections, we present solutions to the
integral in (2) for subsonic (M < 1) as well as super-
sonic (M > 1) sound sources and briefly comment
on the case of sources moving at the speed of sound
(M= 1).
2.1. Sound sources moving at subsonic speeds

For $M<1$, the integral boundaries in (2) can be kept and the solution, i.e. the sound field $s_{M<1}(\mathbf{x},t)$ of a source moving at a speed $v<c$, then reads

$$ s_{M<1}(\mathbf{x},t) = \frac{1}{4\pi} \cdot \frac{s_0(\tilde{t}(\mathbf{x},t))}{\Psi(\mathbf{x},t)} , \tag{6} $$

whereby

$$ \tilde{t}(\mathbf{x},t) = t - \frac{M\,\Phi(\mathbf{x},t) + \Psi(\mathbf{x},t)}{c\,(1-M^2)} , \qquad \Psi(\mathbf{x},t) = \sqrt{\Phi^2(\mathbf{x},t) + y^2(1-M^2)} , \qquad \Phi(\mathbf{x},t) = x - vt - x_s(0) . $$
Fig. 3: Simulated wave fields of a source oscillating monochromatically at fs = 500 Hz and moving along the x-axis in positive x-direction at v = 120 m/s. (a) ℜ{s(x, t0)}. (b) ℜ{pWFS(x, t0)}; the loudspeaker array indicated by the dotted line is situated symmetrically around the y-axis at y0 = 1 m, its overall length is 8 m, the loudspeakers are positioned at intervals of Δx = 0.1 m, and tapering is applied. Due to the employment of the complex notation for time domain signals (see equation (1)), only the real part ℜ{·} of the considered wave field is depicted. The wave fields have been scaled to have comparable levels. The values of the sound pressure are clipped as indicated by the colorbars.
A snapshot of the wave field of a moving sound
source described by equation (6) is depicted in figure
3(a).
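A minimal sketch of how such a snapshot can be evaluated from equation (6) is given below. The grid extent, the snapshot instant, and the unit source amplitude are illustrative assumptions.

```python
# Minimal sketch of equation (6): field of a monopole moving uniformly along
# the x-axis at subsonic speed, evaluated on a grid as in figure 3(a).
import numpy as np

c, v, f_s = 343.0, 120.0, 500.0
M = v / c
omega_s = 2 * np.pi * f_s
xs0 = 0.0                                    # x_s(0)
t0 = 0.0                                     # time instant of the snapshot

x = np.linspace(-3, 3, 400)
y = np.linspace(-1, 5, 400)
X, Y = np.meshgrid(x, y)

Phi = X - v * t0 - xs0                       # Phi(x, t)
Psi = np.sqrt(Phi**2 + Y**2 * (1 - M**2))    # Psi(x, t)
t_tilde = t0 - (M * Phi + Psi) / (c * (1 - M**2))

s = np.exp(1j * omega_s * t_tilde) / (4 * np.pi * Psi)   # a0 = 1
field = np.real(s)                           # as in the figures, plot Re{s}
```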
For $M=0$, i.e. a static source, equation (6) reads

$$ s_{M=0}(\mathbf{x},t) = \frac{1}{4\pi} \cdot \frac{s_0(t-\tau)}{|\mathbf{x}-\mathbf{x}_s|} , \tag{7} $$

which corresponds to the familiar expression for the sound field of a static harmonic monopole sound source [6].
2.2. Sound sources moving at supersonic speeds

For sound sources moving at supersonic speeds, the integral in (2) has to be split into a sum of two integrals after the substitution (5), reading

$$ s_{M>1}(\mathbf{x},t) = \int_{u_1} (\cdot)\, \mathrm{d}u + \int_{u_2} (\cdot)\, \mathrm{d}u , \tag{8} $$

whereby

$$ u_{1,2} = \frac{1}{v}\left( \pm\big(x_s(0) - x\big) + y\sqrt{M^2-1} \right) $$

and $(\cdot)$ denotes the integrand of (2). The solution yields the wave field $s_{M>1}(\mathbf{x},t)$ of a monopole sound source moving at a supersonic speed $v$, reading

$$ s_{M>1}(\mathbf{x},t) = \begin{cases} s_1(\mathbf{x},t) + s_2(\mathbf{x},t) & \text{for } \Phi^2(\mathbf{x},t) + y^2(1-M^2) \ge 0 \text{ and } x_s(0)+vt \ge x \\ 0 & \text{elsewhere} , \end{cases} \tag{9} $$

with

$$ s_{1,2}(\mathbf{x},t) = \frac{1}{4\pi}\, \frac{s_0(\tilde{t}_{1,2}(\mathbf{x},t))}{\Psi(\mathbf{x},t)} , \qquad \tilde{t}_{1,2}(\mathbf{x},t) = t - \frac{M\,\Phi(\mathbf{x},t) \pm \Psi(\mathbf{x},t)}{c\,(1-M^2)} . $$
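The following is a minimal sketch of equation (9): the two field components and the causality mask of a supersonic monopole evaluated on a grid (cf. figure 4). The grid and M ≈ 1.7 follow the simulations in the paper; variable names and the snapshot instant are illustrative assumptions, and the field is singular on the Mach cone itself.

```python
# Minimal sketch of equation (9): components s1, s2 of a supersonic monopole
# and the causality mask; the field is zero ahead of the Mach cone.
import numpy as np

c, v, f_s = 343.0, 600.0, 500.0
M = v / c
omega_s = 2 * np.pi * f_s
xs0, t0 = 0.0, 0.0

x = np.linspace(-4, 2, 400)
y = np.linspace(-3, 3, 400)
X, Y = np.meshgrid(x, y)

Phi = X - v * t0 - xs0
arg = Phi**2 + Y**2 * (1 - M**2)             # non-negative behind the Mach cone
mask = (arg >= 0) & (xs0 + v * t0 >= X)      # causality condition of eq. (9)

Psi = np.sqrt(np.where(mask, arg, np.nan))
t1 = t0 - (M * Phi + Psi) / (c * (1 - M**2))
t2 = t0 - (M * Phi - Psi) / (c * (1 - M**2))

s1 = np.exp(1j * omega_s * t1) / (4 * np.pi * Psi)   # backward-travelling part
s2 = np.exp(1j * omega_s * t2) / (4 * np.pi * Psi)   # forward-travelling part
s_total = np.where(mask, s1 + s2, 0.0)               # zero ahead of the cone
```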
The most prominent property of the wave field of a
supersonic source is the formation of the so-called
Mach cone, a conical sound pressure front following
the moving source. See figure 4(a). Note that the Mach cone is a direct consequence of causality.

Fig. 4: Wave field of a source traveling at 600 m/s (M ≈ 1.7). (a) Wave field sM>1(x, t) of a supersonic source. (b) Backward travelling component s1(x, t). (c) Forward travelling component s2(x, t).
For the receiver this has two implications: (1) no sound wave is received before the arrival of the Mach cone; (2) after the arrival of the Mach cone, the receiver is exposed to a superposition of the wave field s1(x, t) which the source radiates in the backward direction and the wave field s2(x, t) which the source had radiated in the forward direction before the arrival of the Mach cone. s1(x, t) carries a frequency-shifted version of the emitted signal propagating in the direction opposite to the source motion (figure 4(b)); s2(x, t) carries a time-reversed version of the emitted signal following the source (figure 4(c)). The latter is generally also shifted in frequency.
2.3. Sound sources moving at the speed of sound

The integral in (2) can also be solved for M = 1. In that case, the lower integral boundary is finite while the upper boundary is infinite. The result resembles the circumstances for M > 1, i.e. the receiver is not exposed to the source's wave field at all times. Rather, the source moves at the leading edge of the sound waves it emits; the wave field cannot surpass the source. The leading edge of the wave field is termed the sound barrier.

Unlike for M > 1, the resulting wave field is not composed of two different components. It contains only a single component carrying the frequency-shifted input signal.

Informal listening suggests that the human ear cannot be assumed to be aware of the detailed properties of the wave field of a transonic source (a source moving exactly at the speed of sound). We therefore do not present an explicit treatment here. For convenience, we propose to assume that the wave field of a transonic source is perceptually indistinguishable from the wave field s1(x, t) of a source moving at a speed slightly faster than the speed of sound c.
3. WAVE FIELD SYNTHESIS

In this section, we demonstrate how a moving virtual sound source can be reproduced using the findings derived in section 2. As an example, we use wave field synthesis (WFS) employing a linear array of secondary sources (loudspeakers).
The theoretical basis of WFS employing linear sec-
ondary source arrays is given by the two-dimensional
Rayleigh I integral [11, 12]. It states that a linear
distribution of monopole line sources is capable of
reproducing a desired wave field (a virtual source)
in one of the half planes defined by the secondary
source distribution. The wave field in the other half
(where the virtual source is situated) is a mirrored
copy of the desired wave field. For convenience, the
secondary source array is assumed to be parallel to
the x-axis at y=y0as depicted in figures 1 and
3(b). The listening area is chosen to be at y > y0.
The two-dimensional Rayleigh I integral determines the sound pressure $p_{\mathrm{WFS}}(\mathbf{x},t)$ created by such a setup, reading

$$ p_{\mathrm{WFS}}(\mathbf{x},t) = \int_{-\infty}^{\infty} \underbrace{\frac{\partial}{\partial n}\, s(\mathbf{x},t)\Big|_{\mathbf{x}=\mathbf{x}_0}}_{d(\mathbf{x}_0,t)} \; *_t\; g(\mathbf{x},t)\, \mathrm{d}x_0 . \tag{10} $$

$s(\mathbf{x},t)$ denotes the sound field of the virtual source and $\frac{\partial}{\partial n}$ the gradient in the direction normal to the secondary source distribution (confer also to figure 1). The asterisk $*_t$ denotes convolution with respect to time.
The driving function d(x0, t) for a loudspeaker at po-
sition x0is thus yielded by evaluating the gradient
of the desired virtual sound field in direction normal
to the loudspeaker distribution at the position of the
respective loudspeaker.
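A discretized reading of equation (10) is sketched below: each secondary source contributes a delayed, 1/r-attenuated copy of its driving signal at the receiver, and the contributions are summed over the array. The point-source Green's function, the array geometry, and the placeholder driving function are illustrative assumptions; the actual driving signals are given by equations (12) and (13) below.

```python
# Minimal sketch of a discretised version of equation (10): sum over the
# loudspeakers of the driving signal convolved with a free-field point-source
# Green's function (a pure delay with 1/(4*pi*r) attenuation).
import numpy as np

fs, c = 44100.0, 343.0
dx = 0.1                                       # loudspeaker spacing in m
y0 = 1.0                                       # array position y = y0
x0 = np.arange(-7.0, 7.0 + dx, dx)             # 14 m array along the x-axis
receiver = np.array([0.0, 4.0])

n_samples = int(0.5 * fs)
p = np.zeros(n_samples)

def driving_signal(x_speaker, n_samples):
    """Placeholder: return d(x0, t) sampled at fs for one loudspeaker."""
    return np.zeros(n_samples)                 # to be replaced by eq. (12)/(13)

for xi in x0:
    d = driving_signal(xi, n_samples)
    r = np.hypot(receiver[0] - xi, receiver[1] - y0)
    delay = int(round(r / c * fs))             # propagation delay in samples
    contrib = np.zeros(n_samples)
    contrib[delay:] = d[:n_samples - delay] / (4 * np.pi * r)
    p += contrib * dx                          # Riemann sum over the array
```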
Due to the fact that the physical requirements cannot be perfectly fulfilled in practical implementations, the virtual source's wave field is not perfectly reproduced in the receiver's half-space. Equation (10) requires an infinitely long continuous distribution of secondary sources, whereas practical implementations can only employ a finite number of discrete loudspeakers; the array thus has a finite length. Furthermore, equation (10) requires secondary line sources which are positioned perpendicular to the receiver plane [12]. Practical implementations typically employ loudspeakers with closed cabinets as secondary sources, which are more accurately described by point sources rather than line sources. This fact is known as secondary source mismatch and has to be compensated for as

$$ d_{\mathrm{corr}}(\mathbf{x},t) = f(t) *_t d(\mathbf{x},t) . \tag{11} $$
$f(t)$ is a filter with frequency response $F(\omega) = \sqrt{2\pi\, j k\, d_{\mathrm{ref}}}$, the asterisk $*_t$ denotes convolution with respect to time, and $d_{\mathrm{ref}}$ denotes the reference distance from the secondary source array to which the amplitude of the reproduced wave field is referenced.
See [12] for a thorough treatment of the properties
of WFS.
For convenience, we do not explicitly compensate for
the secondary source mismatch in the analytical ex-
pressions for the driving functions. However, in the
simulations this compensation is performed.
3.1. Driving function for subsonic sources

For a virtual harmonic monopole sound source of angular frequency $\omega_s$ moving uniformly along the x-axis as described in section 2, the driving function $d(\mathbf{x},t)$ derived from (6) and (10) reads [6]

$$ d_{\mathrm{sub}}(\mathbf{x},t) = \frac{y\,(1-M^2)}{\Psi(\mathbf{x},t)} \left( \frac{1}{\Psi(\mathbf{x},t)} + \frac{j\omega_s}{c\,(1-M^2)} \right) s(\mathbf{x},t) . \tag{12} $$

Note that $d_{\mathrm{sub}}(\mathbf{x},t)$ in equation (12) implicitly includes static virtual sources.

The wave field reproduced by a linear WFS array driven by equation (12) is depicted in figure 3(b). The overall length of the loudspeaker array is 8 m. The virtual source moves at a speed v = 120 m/s along the x-axis in positive x-direction (M ≈ 1/3).
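A minimal sketch of equation (12) for one secondary source is given below. The position of the loudspeaker, the start position of the virtual source, and the omission of the correction filter of equation (11) are illustrative assumptions.

```python
# Minimal sketch of equation (12): driving signal of one secondary source at
# (x0, y0) for a virtual monopole moving subsonically along the x-axis.
import numpy as np

c, v, f_s = 343.0, 120.0, 500.0
M = v / c
omega_s = 2 * np.pi * f_s
xs0 = -20.0                                    # x_s(0)
fs = 44100.0
t = np.arange(int(1.0 * fs)) / fs

def d_sub(x0, y0, t):
    Phi = x0 - v * t - xs0
    Psi = np.sqrt(Phi**2 + y0**2 * (1 - M**2))
    t_tilde = t - (M * Phi + Psi) / (c * (1 - M**2))
    s = np.exp(1j * omega_s * t_tilde) / (4 * np.pi * Psi)            # eq. (6)
    return (y0 * (1 - M**2) / Psi) * (1.0 / Psi
                                      + 1j * omega_s / (c * (1 - M**2))) * s

d = np.real(d_sub(x0=0.5, y0=1.0, t=t))        # one loudspeaker at (0.5, 1) m
```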
3.2. Driving function for supersonic sources

The driving function for supersonic sources derived from (9) and (10) reads

$$ d_{\mathrm{sup}}(\mathbf{x},t) = d_1(\mathbf{x},t) + d_2(\mathbf{x},t) = \frac{y\,(1-M^2)}{\Psi(\mathbf{x},t)} \left( \frac{1}{\Psi(\mathbf{x},t)} + \frac{j\omega_s}{c\,(1-M^2)} \right) s_1(\mathbf{x},t) \;+\; \frac{y\,(1-M^2)}{\Psi(\mathbf{x},t)} \left( \frac{1}{\Psi(\mathbf{x},t)} - \frac{j\omega_s}{c\,(1-M^2)} \right) s_2(\mathbf{x},t) . \tag{13} $$
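Parallel to the subsonic sketch above, the following illustrates equation (13) for one secondary source, including the case distinction of equation (9): outside the Mach cone the driving signal is zero. Parameter values and names are illustrative assumptions.

```python
# Minimal sketch of equation (13): the two driving-signal components for one
# secondary source at (x0, y0) when the virtual monopole moves supersonically.
import numpy as np

c, v, f_s = 343.0, 600.0, 500.0
M = v / c
omega_s = 2 * np.pi * f_s
xs0 = -20.0
fs = 44100.0
t = np.arange(int(0.2 * fs)) / fs

def d_sup(x0, y0, t):
    Phi = x0 - v * t - xs0
    arg = Phi**2 + y0**2 * (1 - M**2)
    mask = (arg >= 0) & (xs0 + v * t >= x0)    # case distinction of eq. (9)
    Psi = np.sqrt(np.where(mask, arg, np.nan))
    pre = y0 * (1 - M**2) / Psi
    t1 = t - (M * Phi + Psi) / (c * (1 - M**2))
    t2 = t - (M * Phi - Psi) / (c * (1 - M**2))
    s1 = np.exp(1j * omega_s * t1) / (4 * np.pi * Psi)
    s2 = np.exp(1j * omega_s * t2) / (4 * np.pi * Psi)
    d1 = pre * (1.0 / Psi + 1j * omega_s / (c * (1 - M**2))) * s1
    d2 = pre * (1.0 / Psi - 1j * omega_s / (c * (1 - M**2))) * s2
    return np.where(mask, d1 + d2, 0.0)

d = np.real(d_sup(x0=0.5, y0=1.0, t=t))
```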
3.3. Driving function for transonic sources

As outlined in section 2.3, we propose to reproduce s1(x, t) of a virtual source moving slightly faster than the speed of sound in order to approximate a transonic source. The appropriate driving function is then d1(x, t).
4. RESULTS
In this section, we present a number of simulations in order to analyze the properties of the proposed approach, with focus on the case of M > 1. The case of M < 1 is thoroughly treated in [6].

Fig. 5: Spectrograms illustrating artifacts apparent in the reproduced wave field of a subsonic source (v = 40 m/s). The virtual source passes the receiver at t ≈ 1 s. (a) Real source; the emitted frequency is 500 Hz. (b) Truncation artifacts; the length of the array is 40 m and the emitted frequency is 500 Hz. (c) Spatial aliasing; the desired signal is the S-shaped one in the middle and the emitted frequency is 4000 Hz.
We assume a linear array of secondary monopole
sources. The secondary sources are placed at an in-
terval of ∆x= 0.1 m throughout the simulations.
The loudspeaker array is situated parallel to the x-
axis and symmetrically around the y-axis at y0= 1
m. Its overall length is 14 m except where stated
explicitly.
As inherent to WFS, the reproduced wave field only
approximates the desired one for y > y0. Due to the
fact that we assume secondary monopole sources,
the reproduced wave field on the other side of the
loudspeaker array (where y < y0) is a mirrored ver-
sion.
4.1. Artifacts apparent in the reproduced wave field

As outlined in [6], the reproduced wave field suffers from two major artifacts: (1) echo-like artifacts due to spatial truncation of the secondary source array, and (2) spatial aliasing when the frequency content of the reproduced wave field is above the spatial aliasing frequency. Figure 5 shows spectrograms of the reproduced wave field observed at xR = [0 4]^T. The loudspeaker array is similar to the one used in the simulations in figure 3, i.e. it is situated symmetrically around the y-axis at y0 = 1 m, its overall length is 8 m, and the loudspeakers are positioned at intervals of Δx = 0.1 m.
In figure 5(b), a pre-echo and a post-echo are apparent in addition to the desired signal. The shorter the array, the closer in time to the desired signal the echoes occur. These truncation artifacts can be significantly reduced by the application of tapering (i.e. an attenuation of the secondary sources towards the very ends of the array) [11, 6].
Figure 5(c) depicts the spectrogram of a virtual source reproduced above the spatial aliasing frequency. For the given array with a loudspeaker spacing of Δx = 0.1 m, the spatial aliasing frequency is approximately 1700 Hz [13].
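For reference, a common estimate of the spatial aliasing frequency of a discretized linear array (this estimate is not derived in the present paper; see [13] for the detailed analysis) is

$$ f_{\mathrm{al}} \approx \frac{c}{2\,\Delta x} = \frac{343\ \mathrm{m/s}}{2 \cdot 0.1\ \mathrm{m}} \approx 1715\ \mathrm{Hz}, $$

which is consistent with the value of approximately 1700 Hz quoted above.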
Finally, another artifact resulting from spatial trun-
cation of the secondary source distribution is an in-
correct amplitude envelope of the receiver signal.
This circumstance can be observed when comparing
e.g. figures 5(a) and 5(b). At the very ends of the
depicted time window, the receiver signal due to the
real source is significantly higher in amplitude than
the receiver signal due to the virtual source. In the center of the plot, i.e. when the source is behind the secondary sources from the receiver's point of view, the amplitude due to the virtual source is similar to that due to the real source.

Fig. 6: Simulated wave fields of a source oscillating monochromatically at fs = 500 Hz and moving along the x-axis in positive x-direction at v = 600 m/s (M ≈ 1.7). (a) ℜ{s(x, t0)}. (b) ℜ{pWFS(x, t0)}, no limitation of the temporal bandwidth; strong aliasing artifacts are apparent (see text). (c) ℜ{pWFS(x, t0)}, fmax = 3000 Hz. (d) ℜ{pWFS(x, t0)}, fmax = 2000 Hz. Due to the employment of the complex notation for time domain signals (see equation (1)), only the real part ℜ{·} of the considered wave field is depicted. The wave fields have been scaled to have comparable levels. The values of the sound pressure are clipped as indicated by the colorbars. The loudspeaker array in figures 6(b)-6(d) is indicated by the dotted line. It is situated symmetrically around the y-axis at y0 = 1 m and its overall length is 14 m. The loudspeakers are positioned at intervals of Δx = 0.1 m.
4.2. Direct application of the driving function for M > 1

Figure 6(b) shows a simulation of a WFS system reproducing the wave field depicted in figure 6(a). The virtual source moves at v = 600 m/s, i.e. M ≈ 1.7. Due to the omnidirectionality of the secondary sources, the reproduced wave field in figure 6(b) is
symmetric with respect to the secondary source contour. Note that strong artifacts are apparent. It can be shown that these artifacts occur due to temporal as well as spatial aliasing.

Fig. 7: Details of the wave field of a source of v = 600 m/s (M ≈ 1.7) oscillating at fs = 500 Hz observed at x1 = [1 1]^T. The Mach cone arrives at t ≈ 4·10⁻³ s. (a) f1,2(x1, t); negative frequencies indicate time reversal of the input signal. (b) 1/Ψ(x1, t).
This can be verified by analyzing the instantaneous frequencies f1(t) and f2(t) of the reproduced wave field components s1(x, t) and s2(x, t). Confer to figure 7(a). It can be seen that f1(t) and f2(t) are infinite at the singularity of the Mach cone, i.e. at the moment of the arrival of the Mach cone. After the arrival they quickly decrease to moderate values. The former means that f1(t) and f2(t) will exceed any limit imposed on a reproduction system due to the discrete treatment of time and the discretization of the secondary source distribution.
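The instantaneous frequencies can be estimated numerically as the time derivative of the emission times of equation (9) scaled by the source frequency, as sketched below for the observation point of figure 7. The time grid and finite-difference estimate are illustrative assumptions.

```python
# Minimal sketch: instantaneous frequencies f_{1,2}(t) of the two components at
# a fixed observation point, obtained as f_s * d(t_tilde_{1,2})/dt via finite
# differences. The derivative diverges at the arrival of the Mach cone.
import numpy as np

c, v, f_s = 343.0, 600.0, 500.0
M = v / c
xs0 = 0.0
x1, y1 = 1.0, 1.0                              # observation point [1 1]^T

t = np.linspace(3e-3, 9e-3, 20000)
Phi = x1 - v * t - xs0
arg = Phi**2 + y1**2 * (1 - M**2)
valid = arg >= 0                               # behind the Mach cone only
Psi = np.sqrt(np.where(valid, arg, np.nan))

t1 = t - (M * Phi + Psi) / (c * (1 - M**2))
t2 = t - (M * Phi - Psi) / (c * (1 - M**2))

f1 = f_s * np.gradient(t1, t)                  # instantaneous frequency of s1
f2 = f_s * np.gradient(t2, t)                  # negative values: time reversal
```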
4.3. Modified driving function

In order to prevent temporal aliasing in digital systems due to the discretization of time, it is desirable to limit the bandwidth of the temporal spectrum of the driving function. Typical bandwidths in digital systems are 22050 Hz for systems using a temporal sampling frequency of 44100 Hz and 24000 Hz for systems using a temporal sampling frequency of 48000 Hz.

In order to prevent, or at least reduce, spatial aliasing of the WFS system under consideration, it is desirable to further limit the bandwidth of the temporal spectrum of the driving function to values on the order of the spatial aliasing frequency, which is typically a few thousand Hertz. Recall that the critical frequency above which spatial aliasing occurs in the given secondary source array is approximately 1700 Hz (confer to section 4.1).
A simple means to limit the bandwidth is to fade in the driving signal from the moment when its temporal frequency has dropped below a given threshold. This strategy also avoids the circumstance that the amplitude of the driving signal is infinite at the moment of arrival of the Mach cone. Real-world implementations of WFS systems cannot reproduce arbitrarily high amplitudes.
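One way to realize this fade-in is sketched below: the driving signal is kept at zero while its instantaneous frequency exceeds the threshold and is then ramped up with a short raised-cosine window. The threshold, ramp length, and function names are illustrative assumptions rather than the parameters used in the simulations.

```python
# Minimal sketch of the proposed workaround: zero the driving signal while its
# instantaneous frequency magnitude exceeds f_max, then fade it in.
import numpy as np

fs = 44100.0

def fade_in_after_threshold(d, f_inst, f_max=3000.0, ramp_ms=2.0):
    """Zero the driving signal d while |f_inst| > f_max, then fade it in."""
    below = np.abs(f_inst) < f_max
    if not np.any(below):
        return np.zeros_like(d)
    start = np.argmax(below)                   # first sample below threshold
    n_ramp = int(ramp_ms * 1e-3 * fs)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    gain = np.zeros_like(d)
    gain[start:start + n_ramp] = ramp[:len(gain) - start]
    gain[start + n_ramp:] = 1.0
    return d * gain
```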
Confer to figure 7(b). It depicts the factor 1/Ψ(x, t), which determines the amplitude of the wave field around the Mach cone.
The simulations in figures 6(c) and 6(d) show the reproduced wave field when the driving function is faded in after the instantaneous frequency of the driving function has dropped below 3000 Hz (figure 6(c)) and 2000 Hz (figure 6(d)), respectively. The aliasing artifacts are significantly reduced.
Note that the shorter the fade-in of the driving function, the better the impulsive character of the Mach cone is preserved. However, shorter fade-ins result in stronger spatial aliasing since they impose more high-frequency content onto the signal.
Finally, it has to be considered that spatial aliasing
is not necessarily audible under all circumstances.
5. PERCEPTUAL ASPECTS

Informal listening suggests that the human auditory system is not aware of all the properties of the wave field of supersonic sources. Especially the fact that the wave field contains a component carrying a time-reversed version of the source's input signal is confusing. Depending on the specific situation, it might be preferable to exclusively reproduce s1(x, t), i.e. the component of the wave field carrying the non-reversed input signal.

Furthermore, only the localization when exposed to s1(x, t) is plausible, since s1(x, t) ensures localization of the source at its appropriate location (however with some bias due to the retarded time τ). Exposure of the receiver to s2(x, t) suggests localization of the source in the direction where it "comes from", which also seems unnatural. Finally, the exposure of the receiver to a superposition of s1(x, t) and s2(x, t) suggests the localization of two individual sources.
6. CONCLUSIONS

An approach to the reproduction of the wave field of virtual sound sources moving at supersonic speeds was presented. The approach constitutes an extension of a treatment of the reproduction of the wave field of virtual sound sources moving at subsonic speeds previously published by the authors. It was shown that the reproduced wave field suffers from spatial aliasing artifacts due to the fact that the instantaneous frequency of the virtual sound field is infinite at the moment of arrival of the Mach cone. As a workaround, it was proposed to fade in the driving signal for a given secondary source right after the instantaneous frequency of the driving signal has dropped below a desired threshold. A short fade-in preserves the impulsive quality of the Mach cone.

In order to optimize the reproduction of the sound field of supersonic virtual sources, it is necessary to perform perceptual experiments investigating which properties of the virtual wave field have to be reproduced in order to evoke a plausible perception, both in terms of frequency content and localization.
ACKNOWLEDGEMENTS

We thank Holger Waubke of the Austrian Academy of Sciences for providing us with the notes of his lecture on theoretical acoustics [9].
7. REFERENCES

[1] R. Rabenstein and S. Spors. Multichannel sound field reproduction. In J. Benesty, M. Sondhi, and Y. Huang (Eds.), Springer Handbook on Speech Processing and Speech Communication, Springer Verlag, 2007.

[2] C. Doppler. Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels. In Abhandlungen der königlichen böhmischen Gesellschaft der Wissenschaften, 2, pp. 465–482, 1842.

[3] A. Franck, A. Gräfe, T. Korn, and M. Strauß. Reproduction of moving virtual sound sources by wave field synthesis: An analysis of artifacts. 32nd Int. Conference of the AES, Hillerød, Denmark, Sept. 2007.

[4] H. Strauss. Simulation instationärer Schallfelder für virtuelle auditive Umgebungen. Fortschrittberichte VDI 10/652, VDI Verlag, Düsseldorf, 2000.

[5] Y. Iwaya and Y. Suzuki. Rendering moving sound with the Doppler effect in sound space. Applied Acoustics, Technical note, 68:916–922, 2007.

[6] J. Ahrens and S. Spors. Reproduction of moving virtual sound sources with special attention to the Doppler effect. In 124th Convention of the AES, Amsterdam, The Netherlands, May 17–20, 2008.

[7] N. Epain and E. Friot. Indoor sonic boom reproduction using ANC. In Proceedings of Active, Williamsburg, Virginia, Sep. 20–22, 2004.

[8] J. D. Jackson. Classical Electrodynamics. Wiley, New York, 1975.

[9] H. Waubke. Aufgabenstellung zur Seminararbeit zur Vorlesung "Theoretische Akustik". IEM Graz, 2003.

[10] B. Girod, R. Rabenstein, and A. Stenger. Signals and Systems. J. Wiley & Sons, 2001.

[11] E. W. Start. Direct sound enhancement by wave field synthesis. PhD thesis, Delft University of Technology, 1997.

[12] S. Spors, R. Rabenstein, and J. Ahrens. The theory of wave field synthesis revisited. In 124th Convention of the AES, Amsterdam, The Netherlands, May 17–20, 2008.

[13] S. Spors and R. Rabenstein. Spatial aliasing artifacts produced by linear and circular loudspeaker arrays used for wave field synthesis. In 120th Convention of the AES, Paris, France, May 20–23, 2006.