Spatial Resolution of Late Reverberation in Ambisonic Loudspeaker Playback
Matthias Frank¹, Lukas Gölles¹, Stefan Riedel¹
¹Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Austria
Email: frank@iem.at, goelles@iem.at, riedel@iem.at
Introduction
Convolution of dry audio signals with multichannel room
impulse responses (RIR) is typically used to recreate the
spatial impression of a room. These RIRs are generated
either through room acoustic simulation or measurement
of the respective room. In real-time applications, the
computational effort of convolutions increases with both
the length of the RIR and its number of channels. It thus
seems beneficial to reduce the number of channels espe-
cially in the diffuse, later part of RIRs. In Ambisonics [1],
the number of channels is directly related to the spatial
resolution, the so-called Ambisonics order. The question
arises which order is perceptually necessary for late rever-
beration. This question is especially interesting, as new
algorithms for the enhancement of measured Ambisonic
RIRs [2] employ a salient/diffuse separation [3] that
allows both components to be rendered with different
spatial resolutions.
In binaural Ambisonics rendering for headphones,
authentic playback of the direct sound might require orders
>15 [4], i.e. more than 256 channels. State-of-the-art
psychoacoustically optimized rendering approaches, such
as the magnitude least-squares approach, require far less
resolution [5, 6]. For plausible playback with 6 degrees
of freedom, even 3rd order might be enough [7]. Simi-
lar minimum required orders were found for everything
after the direct sound by Engel et al. [8] for RIRs mea-
sured with a 32-channel compact spherical microphone
array. Focusing on diffuse sound only, experiments in [9]
required 12 to 20 binaurally virtualized loudspeakers on
a sphere, i.e. orders between 2 and 3. Using the magnitude
least-squares approach for diffuse, isotropic reverberation
revealed that 1st order was perceptually sufficient
in binaural rendering due to the employed covariance
constraint [10].
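The channel counts quoted above follow from the number of spherical harmonics up to order N, which is (N + 1)². As a minimal sketch (the function names are illustrative and not taken from any cited work), this relation and its horizontal-only counterpart for circular harmonics, 2N + 1, can be checked as follows:

```python
def spherical_channels(order: int) -> int:
    """Number of Ambisonic channels for full 3D playback (spherical harmonics)."""
    return (order + 1) ** 2


def circular_channels(order: int) -> int:
    """Number of channels for horizontal-only playback (circular harmonics)."""
    return 2 * order + 1


if __name__ == "__main__":
    # Print order, 3D channel count, and 2D channel count side by side.
    for n in (1, 3, 5, 15):
        print(n, spherical_channels(n), circular_channels(n))
```

Order 15 yields 256 spherical channels, so any order above 15 needs more than that, whereas 3rd order needs only 16.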
However, such solutions are not applicable to loudspeaker
playback. For simulated late reverberation, the study
in [11] could reduce the number of loudspeakers on a
sphere from 86 to 12 or 24 without perceptual effect.
Other experiments showed that playback of uncorrelated
noise signals was perceptually close to a ‘diffuse’ 24-
loudspeaker ring using at least 8 surrounding loudspeak-
ers [12]. Evaluating the sensation of envelopment, a
study using uncorrelated broadband pink noises revealed
that a ring of 4 loudspeakers was not enough [13] and
that more than 8 loudspeakers did not increase similar-
ity to a 24-loudspeaker ring [14].
All of these loudspeaker-based experiments varied the
number of active loudspeakers, i.e. the angular distance
of regularly distributed loudspeakers on spheres or cir-
cles. It is not clear if reducing the spatial resolution, i.e.
the Ambisonics order, while using the same number of ac-
tive loudspeakers is perceptually comparable. Moreover,
existing experiments were often limited to loudspeaker
arrangements in anechoic chambers and did not examine
the effect of absent or present direct sound.
This contribution evaluates the perceptually required
spatial resolution of late reverberation in loudspeaker
playback for both isotropic and anisotropic distributions
of reverberation with and without direct sound. The ex-
periment compares lower-order playback to a 5th-order
reference and employs the same loudspeaker arrangement
in an anechoic chamber and a small studio.
The paper first describes the setup and conditions of the
experiment. Then it presents the experimental results
and compares the minimum required orders to literature.
Finally, the paper is summarized and ideas for future
work are outlined.
Setup and Conditions
The experiment examined the perceived similarity of late
reverberation with reduced spatial resolution in compar-
ison to a high-resolution reference. Playback was con-
ducted using a horizontal ring of 12 Genelec 8020 loud-
speakers with a radius of 1.3 m. The same loudspeaker
arrangement was used in the anechoic chamber and a
small studio with a size of 27 m² and a reverberation
time of 0.25 s. The spatial resolution was adjusted by
varying the Ambisonics order in circular harmonics. A
5th-order Ambisonics decoding served as the reference,
which was compared against 0th-, 1st-, 2nd-, and 3rd-
order renderings. Regardless of the playback order, de-
coding used all 12 loudspeakers and the sampling ap-
proach without additional weighting. To ensure consis-
tent spectral balance and level at the central listening
position, order-dependent equalization was applied.
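The decoding described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 12 loudspeakers are taken to be equally spaced starting at 0°, the circular harmonics use a √2 normalization of the cosine and sine terms, and the order-dependent equalization is not included. The function names are hypothetical.

```python
import math


def circular_harmonics(azimuth_rad: float, order: int) -> list:
    """Circular harmonics [1, sqrt(2)cos(m*phi), sqrt(2)sin(m*phi), ...] up to `order`."""
    y = [1.0]
    for m in range(1, order + 1):
        y.append(math.sqrt(2.0) * math.cos(m * azimuth_rad))
        y.append(math.sqrt(2.0) * math.sin(m * azimuth_rad))
    return y


def sampling_decoder(num_speakers: int, order: int) -> list:
    """Sampling decoder: harmonics evaluated at each loudspeaker, scaled by 1/L.

    No additional order weighting (such as max-rE) is applied.
    Assumes equally spaced loudspeakers starting at 0 degrees.
    """
    rows = []
    for l in range(num_speakers):
        phi = 2.0 * math.pi * l / num_speakers
        rows.append([c / num_speakers for c in circular_harmonics(phi, order)])
    return rows


def speaker_gains(source_azimuth_rad: float, order: int, num_speakers: int = 12) -> list:
    """Loudspeaker gains for a single source direction encoded at `order`."""
    y_src = circular_harmonics(source_azimuth_rad, order)
    decoder = sampling_decoder(num_speakers, order)
    return [sum(d * y for d, y in zip(row, y_src)) for row in decoder]


if __name__ == "__main__":
    for n in (0, 1, 2, 3, 5):
        gains = speaker_gains(0.0, n)
        print(n, [round(g, 3) for g in gains])
```

At order 0 every loudspeaker receives the same gain, while higher orders concentrate the gains around the source direction; all 12 loudspeakers stay active in every case.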
The late reverberation was created using FdnReverb,
a feedback delay network [15, 16] plug-in¹. The reverb
was configured with a 64×64 matrix, a room size of 10,
and spectral shaping filters. A low-shelf filter (Q = 0.5, 6 dB
at 100 Hz) increased reverberation time at low frequencies,
while a high-shelf filter (Q = 0.5, -60 dB at 11 kHz)
reduced reverberation at high frequencies. To enhance
the naturalness of the reverberation, an additional filter
was applied, consisting of a 4th-order low cut at 100 Hz,
a high shelf with -8 dB at 2 kHz, and a 4th-order high cut
at 8 kHz. In the FdnReverb plug-in, the output from
a second network with identical parameters but shorter
reverberation time is subtracted to suppress the early,
non-diffuse output, creating a fade-in [17]. The fade-in
parameter was set to 10% of the reverberation time.
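The effect of subtracting a faster-decaying network can be illustrated with idealized amplitude envelopes. The sketch below is not the plug-in's implementation: it assumes purely exponential decays and assumes that the fade-in parameter corresponds to the reverberation time of the second network; the function names are hypothetical.

```python
import math


def decay_amplitude(t: float, rt60: float) -> float:
    """Amplitude envelope of an exponential decay that drops by 60 dB after rt60 seconds."""
    return 10.0 ** (-3.0 * t / rt60)


def fade_in_envelope(t: float, rt60: float, fade_fraction: float = 0.1) -> float:
    """Envelope of the long decay minus a decay with fade_fraction * rt60 (assumed model)."""
    return decay_amplitude(t, rt60) - decay_amplitude(t, fade_fraction * rt60)


def peak_time(rt60: float, fade_fraction: float = 0.1) -> float:
    """Time at which the difference envelope reaches its maximum."""
    a_long = 3.0 * math.log(10.0) / rt60
    a_short = 3.0 * math.log(10.0) / (fade_fraction * rt60)
    # Setting the derivative of exp(-a_long*t) - exp(-a_short*t) to zero.
    return math.log(a_short / a_long) / (a_short - a_long)


if __name__ == "__main__":
    rt60 = 2.0  # mid-frequency reverberation time of the isotropic condition
    for t in (0.0, 0.02, peak_time(rt60), 0.5, 1.0):
        print(round(t, 4), round(fade_in_envelope(t, rt60), 4))
```

Under these assumptions the envelope starts at zero, rises to a maximum at rt60/27 for a fade-in of 10%, and then follows the decay of the longer network.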
¹ Freely available as part of the IEM Plug-in Suite: https://plugins.iem.at/
DAS/DAGA 2025 Copenhagen
For the isotropic reverb, the reverberation time was set
to 2.0 s for mid frequencies, and the 64 outputs were
randomly distributed to 64 equally spaced directions on
a circle. The anisotropic reverberation simulated a re-
ceiver positioned in a doorway between two rooms with
differing reverberation times. Thus, it used two differ-
ent networks, with 3.0 s and 1.5 s reverberation time,
respectively. On the left, 32 outputs from the longer
reverb were randomly assigned to 32 equally spaced di-
rections on the left semicircle, and on the right, the same
procedure was done for the shorter reverb. To maintain
diffusivity across all playback orders, the number of re-
verberation directions remained constant [18].
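The assignment of reverb outputs to directions can be sketched as follows. The exact angles are an assumption (sector centers, with positive azimuth pointing to the left), as are the fixed random seed and all function names; only the counts of 64 and 2 × 32 directions come from the text.

```python
import random


def equally_spaced(start_deg: float, span_deg: float, count: int) -> list:
    """Centers of `count` equal sectors covering `span_deg` degrees from `start_deg`."""
    step = span_deg / count
    return [start_deg + (i + 0.5) * step for i in range(count)]


def assign_outputs(directions_deg: list, rng: random.Random) -> list:
    """Random one-to-one mapping: entry k is the direction of reverb output k."""
    shuffled = list(directions_deg)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(0)  # fixed seed only for reproducibility of this sketch

# Isotropic condition: 64 outputs of one network on the full circle.
isotropic = assign_outputs(equally_spaced(0.0, 360.0, 64), rng)

# Anisotropic condition: 32 outputs of the longer reverb on the left semicircle,
# 32 outputs of the shorter reverb on the right semicircle.
left = assign_outputs(equally_spaced(0.0, 180.0, 32), rng)
right = assign_outputs(equally_spaced(-180.0, 180.0, 32), rng)

if __name__ == "__main__":
    print(isotropic[:4], left[:4], right[:4])
```

Each direction would then be encoded at the playback order under test, so the number of reverberation directions stays at 64 regardless of the order.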
Both reverberation conditions were tested with and with-
out direct sound. In the direct sound condition, a single
frontal loudspeaker was used, with its level adjusted to
achieve a direct-to-reverberant energy ratio of 0 dB. As
a signal, the first sentence of EBU’s male speech sample
was employed [19]. The listening experiment was per-
formed as a MUSHRA-like [20] multi-stimulus similarity
comparison of 5 conditions (playback at orders 0, 1, 2,
3, and 5 as hidden reference) to the 5th-order reference.
For each listening room, there were 4 trials: isotropic and
anisotropic reverb, both with and without frontal direct
sound. Head movements were allowed to increase sensi-
tivity.
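Setting a direct-to-reverberant energy ratio amounts to scaling one signal relative to the other. The following minimal sketch assumes that both signals are available as sample sequences at the listening position; how the levels were actually measured is not stated here, and the function names are illustrative.

```python
import math


def energy(signal) -> float:
    """Sum of squared samples."""
    return sum(x * x for x in signal)


def direct_gain_for_drr(direct, reverb, target_drr_db: float = 0.0) -> float:
    """Linear gain for `direct` so that 10*log10(E_direct / E_reverb) equals the target."""
    e_direct = energy(direct)
    e_reverb = energy(reverb)
    if e_direct == 0.0 or e_reverb == 0.0:
        raise ValueError("both signals must contain energy")
    current_db = 10.0 * math.log10(e_direct / e_reverb)
    # Energy ratios in dB translate to amplitude gains via a factor of 20.
    return 10.0 ** ((target_drr_db - current_db) / 20.0)


if __name__ == "__main__":
    direct = [1.0, 0.0, 0.0, 0.0]
    reverb = [0.5] * 16  # toy signal with energy 4.0
    print(direct_gain_for_drr(direct, reverb))
```

For a target of 0 dB, the scaled direct sound carries the same energy as the reverberation.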
There were 23 participants (7 female) in the experiment
with an average age of 30 years (min 23, max 50). On av-
erage, it took them 7.5 minutes in each room to complete
the experiment.
Results
Figure 1 shows median values and confidence intervals
of the similarity ratings. Obviously, similarity increases
with the order. Bonferroni-Holm corrected p-values
of a Wilcoxon signed-rank test were used to check for
significant differences in the ratings for the reduced
spatial resolutions compared to the 5th-order references,
reported alongside Cliff’s δ as an indicator of effect size.
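For readers who want to reproduce this kind of analysis, the two simpler ingredients can be written in plain Python. This is a sketch, not the analysis script of the study: the function names are illustrative, and the Wilcoxon signed-rank test itself is left to a statistics library (for example `scipy.stats.wilcoxon`).

```python
def cliffs_delta(x, y) -> float:
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (len(x) * len(y))."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def holm_correction(p_values) -> list:
    """Bonferroni-Holm adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    print(cliffs_delta([0.9, 1.0, 0.95], [0.2, 0.4, 0.3]))
    print(holm_correction([0.01, 0.04, 0.03]))
```

A delta of 1 means that every rating in the first group exceeds every rating in the second, and a delta of 0 means that neither group dominates.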
For the isotropic reverberation in the anechoic cham-
ber, significant differences were observed between the
reference and orders 0, 1, 2, and 3 (p ≤ 0.02) with large
effect sizes (δ ≥ 0.478) in the absence of direct sound.
Figure 1: Median values and 95% confidence intervals of rated similarity to the 5th-order reference (vertical axis, 0 to 1) over playback order (horizontal axis: 5, 3, 2, 1, 0) in the anechoic chamber and the studio, each shown for the pure reverb and for the reverb with direct sound. Panels: (a) anechoic: isotropic reverb; (b) studio: isotropic reverb; (c) anechoic: anisotropic reverb; (d) studio: anisotropic reverb.
When direct sound was present, significant differences
(p ≤ 0.007) and large effects (δ ≥ 0.535) were found up
to the 2nd order. For the anisotropic reverberation in
the same environment, significant differences from the
reference were found up to the 2nd order both without
direct sound (p ≤ 0.009, large effects, δ ≥ 0.785) and
with direct sound (p ≤ 0.013, large effects, δ ≥ 0.473).
In the studio, for the isotropic reverb, there were
significant differences from the reference up to the 3rd
order (p ≤ 0.01) with large effects (δ ≥ 0.492) without
direct sound, whereas with direct sound, significant
differences (p ≤ 0.007) with large effects (δ ≥ 0.580)
were found only up to the 2nd order.
For the anisotropic reverb in the studio, there were
significant differences from the reference up to the 3rd
order (p ≤ 0.015) with at least medium effects (δ ≥ 0.401)
without direct sound, while a reduction to the 2nd order
was required for a large effect (δ = 0.510). With direct
sound, there were significant differences (p ≤ 0.018) with
at least medium effects (δ ≥ 0.450) up to the 2nd order,
while again, a reduction to the 1st order was required
for a large effect (δ = 0.60).
Discussion
To summarize the above results and make them practi-
cally applicable, it is helpful to determine the minimum
required order that provides just enough resolution to be
indistinguishable from the reference. Thus, the minimum
required order is the lowest order that did not exhibit a
significant difference to the 5th-order reference with a
large effect (δ ≥ 0.47 [21]). These orders are summarized
in Table 1 for each reverb setting and playback room.
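The selection rule can be written down compactly. The sketch below encodes one reading of it, namely that an order counts as distinguishable only if the difference is both significant and of large effect size; the significance level of 0.05 is an assumption, and the p-values and effect sizes in the example are hypothetical placeholders, not the measured data.

```python
LARGE_EFFECT = 0.47  # threshold for a large Cliff's delta as used in the text [21]
ALPHA = 0.05         # assumed significance level (not stated in the text)


def is_distinguishable(p_value: float, delta: float) -> bool:
    """True if the difference to the reference is significant and large."""
    return p_value < ALPHA and abs(delta) >= LARGE_EFFECT


def minimum_required_order(results: dict, reference_order: int = 5) -> int:
    """Scan from the highest tested order downwards and stop at the first failure.

    `results` maps playback order -> (corrected p-value, Cliff's delta).
    """
    required = reference_order
    for order in sorted(results, reverse=True):
        p_value, delta = results[order]
        if is_distinguishable(p_value, delta):
            break
        required = order
    return required


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    example = {0: (0.001, 0.95), 1: (0.001, 0.90), 2: (0.004, 0.70), 3: (0.20, 0.10)}
    print(minimum_required_order(example))
```

Scanning downwards from the highest tested order keeps the result well defined even if a lower order happens to pass while a higher one fails.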
In both rooms, the highest tested order was required for
the isotropic reverberation without direct sound. As ex-
pected, additional direct sound reduced the sensitivity,
resulting in 3rd order playback as minimum requirement.
Interestingly, the anisotropic reverb required only a reso-
lution of 3rd order without direct sound and only 2nd or-
der with direct sound in the studio. The results for both
playback rooms were the same, except for the anisotropic
reverb with direct sound.
Our minimum required orders of 3 and 2 for playback
with direct sound agree with the findings for headphone
playback in [8] and for loudspeakers in [11], when con-
verting the number of required loudspeakers between 12
and 24 to equivalent spherical harmonic orders. They
are also similar to the saturation of envelopment that
occurred for a ring of 8 loudspeakers in [14].
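The conversion from loudspeaker counts to equivalent orders can be made explicit. The sketch assumes the common rule that an order N is supported when the number of harmonics does not exceed the number of loudspeakers, i.e. (N + 1)² ≤ L on a sphere and 2N + 1 ≤ L on a ring; the function names are illustrative.

```python
import math


def max_spherical_order(num_loudspeakers: int) -> int:
    """Largest N with (N + 1)**2 <= num_loudspeakers (loudspeakers on a sphere)."""
    return math.isqrt(num_loudspeakers) - 1


def max_circular_order(num_loudspeakers: int) -> int:
    """Largest N with 2*N + 1 <= num_loudspeakers (loudspeakers on a ring)."""
    return (num_loudspeakers - 1) // 2


if __name__ == "__main__":
    for count in (12, 24):
        print("sphere", count, max_spherical_order(count))
    for count in (8, 12):
        print("ring", count, max_circular_order(count))
```

Under this rule, 12 and 24 loudspeakers on a sphere correspond to orders 2 and 3, a ring of 8 loudspeakers corresponds to order 3, and the ring of 12 used here supports the 5th-order reference.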
Table 1: Minimum required orders to be indistinguishable
from the 5th-order reference (pure: reverb only, + DS: with direct sound).

                     isotropic        anisotropic
                     pure    + DS     pure    + DS
anechoic chamber     5       3        3       3
studio               5       3        3       2
Conclusion
This contribution presented an experiment to determine
the perceptually required spatial resolution of late rever-
beration in Ambisonic loudspeaker playback at the cen-
tral listening position. The experiment was conducted
using the same horizontal loudspeaker setup in two envi-
ronments: an anechoic chamber and a small studio room.
The late reverberation was created by feedback delay networks
and tested under two conditions: an isotropic condition
and an anisotropic condition with a shorter reverberation
time on the right side. Both reverb conditions
were excited by speech and tested with and without di-
rect sound. In the experiment, the spatial resolution
was defined by Ambisonics orders and the reduced or-
ders were compared to a 5th-order reference.
The results did not reveal any practically relevant differ-
ences between the anechoic chamber and the small stu-
dio regarding the minimum order required to be percep-
tually indistinguishable from the reference. In general,
the presence of direct sound reduced the required Am-
bisonics order. Moreover, less resolution was required for
anisotropic reverberation compared to isotropic reverber-
ation. In practice, 3rd-order rendering of late reverbera-
tion appears sufficient for loudspeaker playback in both
listening environments at the central listening position.
This is also relevant for 3D audio content production,
as both ADM and MPEG-H as production and transmission
formats support at least 3rd-order Ambisonics
in addition to direct sound objects.
Future work should investigate the required spatial res-
olution of reverberation at off-center listening positions
and in larger (studio) listening spaces. Additionally, it
should extend current findings by investigating the effect
of state-of-the-art Ambisonic headphone decoding on the
perception of isotropic and anisotropic reverberation.
References
[1] F. Zotter and M. Frank, Ambisonics, ser. Springer
Topics in Signal Processing. Springer, 2019.
[2] L. Gölles and M. Frank, “Ambisonic spatial decomposition
method with salient / diffuse separation,”
in Proceedings of the 158th AES Convention, Warsaw,
Poland, May 2025.
[3] T. Deppisch, S. V. A. Garí, P. Calamia, and
J. Ahrens, “Direct and residual subspace decomposi-
tion of spatial room impulse responses,” IEEE/ACM
Transactions on Audio, Speech, and Language Pro-
cessing, vol. 31, pp. 927–942, 2023.
[4] B. Bernschütz, A. V. Giner, C. Pörschmann, and
J. Arend, “Binaural reproduction of plane waves
with reduced modal order,” Acta Acustica united
with Acustica, vol. 100, no. 5, pp. 972–983, 2014.
[5] M. Zaunschirm, C. Schörkhuber, and R. Höldrich,
“Binaural rendering of Ambisonic signals by head-
related impulse response time alignment and a dif-
fuseness constraint,” J. Acoust. Soc. Am., vol. 143,
no. 6, pp. 3616–3627, 2018.
[6] C. Schörkhuber, M. Zaunschirm, and R. Höldrich,
“Binaural Rendering of Ambisonic Signals via Mag-
nitude Least Squares,” in Fortschritte der Akustik -
DAGA, Munich, March 2018.
[7] K. Enge, M. Frank, and R. Höldrich, “Listening
experiment on the plausibility of acoustic modeling in
virtual reality,” in Fortschritte der Akustik-DAGA,
2020.
[8] I. Engel, C. Henry, S. V. Amengual Garí, P. W.
Robinson, and L. Picinali, “Perceptual implications
of different ambisonics-based methods for binaural
reverberation,” J. Acoust. Soc. Am., vol. 149, no. 2,
pp. 895–910, 2021.
[9] M.-V. Laitinen and V. Pulkki, “Binaural reproduc-
tion for directional audio coding,” in 2009 IEEE
Workshop on Applications of Signal Processing to
Audio and Acoustics, 2009, pp. 337–340.
[10] D. Perinovic and M. Frank, “Spatial resolution of
diffuse reverberation in binaural ambisonic play-
back,” Fortschritte der Akustik, DAGA, pp. 1622–
1624, 2021.
[11] C. Kirsch, J. Poppitz, T. Wendt, S. van de Par, and
S. D. Ewert, “Spatial resolution of late reverberation
in virtual acoustic environments,” Trends in Hear-
ing, vol. 25, p. 23312165211054924, 2021.
[12] K. Hiyama, S. Komiyama, and K. Hamasaki, “The
minimum number of loudspeakers and its arrange-
ment for reproducing the spatial impression of dif-
fuse sound field,” in 113th Convention of the Audio
Engineering Society, 2002, paper 5696.
[13] S. Riedel et al., “The effect of temporal and direc-
tional density on listener envelopment,” J. Audio
Eng. Soc., vol. 71, no. 7/8, pp. 455–467, 2023.
[14] S. Riedel and F. Zotter, “Surrounding line sources
optimally reproduce diffuse envelopment at off-
center listening positions,” JASA Express Letters,
vol. 2, no. 9, 2022.
[15] J. Stautner and M. Puckette, “Designing Multi-
Channel Reverberators,” Computer Music Journal,
vol. 6, no. 1, pp. 52–65, 1982.
[16] J.-M. Jot and A. Chaigne, “Digital delay networks
for designing artificial reverberators,” in 90th AES
Conv., prepr. 3030, Paris, February 1991.
[17] N. Meyer-Kahlen, S. J. Schlecht, T. Lokki et al.,
“Fade-in control for feedback delay networks,” in
International Conference on Digital Audio Effects,
2020.
[18] M. Frank and M. Brandner, “Perceptual Evalua-
tion of Spatial Resolution in Directivity Patterns,”
in Fortschritte der Akustik, DAGA, Rostock, Mar.
2019.
[19] EBU, “EBU SQAM CD: Sound Quality
Assessment Material recordings for subjec-
tive tests,” 2008. [Online]. Available: https:
//tech.ebu.ch/publications/sqamcd
[20] ITU-R, “BS.1534: Method for the subjective
assessment of intermediate quality levels of coding
systems,” Tech. Rep., 2015. [Online]. Available:
itu.int/rec/R-REC-BS.1534
[21] J. Romano, J. D. Kromrey, J. Coraggio, and
J. Skowronek, “Appropriate statistics for ordinal
level data: Should we really be using t-test and
Cohen’s d for evaluating group differences on the
NSSE and other surveys?” in Annual meeting of the
Florida Association of Institutional Research, vol.
177, no. 34, 2006.