2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics October 20-23, 2019, New Paltz, NY
DENSE REVERBERATION WITH DELAY FEEDBACK MATRICES
Sebastian J. Schlecht and Emanuël A.P. Habets
International Audio Laboratories Erlangen∗, Germany
sebastian.schlecht@audiolabs-erlangen.de
ABSTRACT
Feedback delay networks (FDNs) belong to a general class of re-
cursive filters which are widely used in artificial reverberation and
decorrelation applications. One central challenge in the design of
FDNs is the generation of sufficient echo density in the impulse
response without compromising the computational efficiency. In a
previous contribution, we have demonstrated that the echo density
of an FDN grows polynomially over time, and that the growth de-
pends on the number and lengths of the delays. In this work, we
introduce so-called delay feedback matrices (DFMs) where each
matrix entry is a scalar gain and a delay. While the computational
complexity of DFMs is similar to a scalar-only feedback matrix,
we show that the echo density grows significantly faster over time,
however, at the cost of non-uniform modal decays.
Index Terms—Feedback Delay Network, Artificial Reverber-
ation, Echo Density
1. INTRODUCTION
If sound is emitted in a room, the sound waves travel across the
space and are repeatedly reflected at the room boundaries resulting
in acoustic reverberation [1]. If the sound is reflected at a smooth
boundary, the reflection is coherent (specular), while it is incoherent
(scattered) when reflected by a rough surface. In small to large
rooms, the mean free path of specular reflections is between 10 and
100 ms, whereas an incoherent reflection produces a dense response
lasting a few milliseconds [2]. In geometric room acoustics, an
incoherent reflection can be effectively generated from a set of close
image sources [2], see Fig. 1. Consequently, scattering increases the
total number of image sources and therefore the echo density [3]. In
this work, we propose a method to effectively introduce scattering-like
effects into artificial reverberation filter structures.
Many artificial reverberators have been developed in recent
years [4] among which the feedback delay network (FDN), orig-
inally proposed by Gerzon [5] and further developed by Jot and
Chaigne [6], is one of the most popular. The FDN consists of N
delay lines combined with attenuation filters which are fed back
via a scalar feedback matrix A. To maximize the echo density,
we assume throughout this work that the feedback matrix A is dense,
i.e., all matrix entries are non-zero [7]. In [3], the present authors
demonstrated that the absolute echo density of an FDN grows
polynomially with the degree equal to the number of delay lines N and
inversely to the delay lengths, i.e., shorter delay lines produce faster
increasing absolute echo density. A major challenge of FDN design
is to achieve sufficient echo density because of an inherent trade-off
between three aspects: computational complexity, modal density,
and echo density. A higher number of delays increases both modal
and echo density, but also the computational complexity. With a
fixed number of delays, shorter delays increase the echo density but
simultaneously result in a lower modal density [6].

Figure 1: Second order reflections at two rough surfaces represented
with an equivalent set of image sources (inspired by [2]). The
energy of the image sources is indicated by the color saturation.
(Diagram labels: source, receiver, wall 1, wall 2, first order image
sources, second order image sources.)

∗The International Audio Laboratories Erlangen are a joint institution of
the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and
Fraunhofer Institut für Integrierte Schaltungen IIS.
To reproduce a scattering-like effect, we require a set of long
delays which are proportional to the mean free path [8], and a set of
filters to add the short-term density to each reflection (see Fig. 1).
Feedforward-feedback allpass filters have been introduced in series
to increase the short-term echo density [9, 10]. Alternatively, all-
pass filters may be placed after the delay lines [11] which in turn
doubles the effective size of the FDN [12]. Instead of allpass filters,
FIR filters with pseudo-random exponentially decaying coefficients
were proposed [13].
In this contribution, we propose an FDN with a delay feedback
matrix (DFM), i.e., each matrix entry has a delay and a gain. We
demonstrate that the DFM can increase the echo density signifi-
cantly without introducing further filtering. In Section 2, we present
the new filter structure and compare it to the standard FDN. In Sec-
tion 3, we demonstrate the echo density of the proposed FDN. In
Section 4, an example structure is presented.
2. PROPOSED FEEDBACK DELAY NETWORK
In the following, we present the proposed extension to the FDN and
explain the stability conditions.
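Figure 2: Delay feedback delay network with three delays ($N = 3$)
and a delay feedback matrix $A(z)$ instead of the standard scalar
feedback matrix $A$. (Block diagram: the input $X(z)$ is scaled by the
input gains $b_1, b_2, b_3$, passes through the delays $z^{-m_1}$,
$z^{-m_2}$, $z^{-m_3}$, and is scaled by the output gains
$c_1, c_2, c_3$ to form the output $Y(z)$; the delay outputs are fed
back through $A(z)$.)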
2.1. Delay Feedback Matrix
The proposed FDN is structured similarly to a standard FDN, but
employs a delay feedback matrix (DFM) $A(z)$ instead of a scalar
feedback matrix $A$ (see Fig. 2). The transfer function of the
proposed FDN is

$$H(z) = c^\top \left[ D_m(z^{-1}) - A(z) \right]^{-1} b + d, \tag{1}$$

where the column vectors $b$ and $c$ consist of the input gains $b_i$
and the output gains $c_i$, respectively. The lengths of the $N$ delays
in samples are given by $m = [m_1, \dots, m_N] \in \mathbb{N}^N$, and
$D_m(z) = \operatorname{diag}[z^{-m_1}, z^{-m_2}, \dots, z^{-m_N}]$ is
the diagonal $N \times N$ delay matrix. The attenuation filters which
are usually applied in the feedback loop are omitted for clarity and
can be added as an additional processing unit in series with the delay
lines [6]. The $N \times N$ DFM $A(z)$ is given by

$$A(z) = U \circ D_M(z), \tag{2}$$

where $\circ$ denotes the Hadamard (element-wise) product, $U$ is a
scalar matrix, $M$ is the matrix of non-negative integer matrix delays,
and the entries of the delay matrix $D_M(z)$ are
$[D_M(z)]_{ij} = z^{-M_{ij}}$ for $1 \le i, j \le N$. If all matrix
delays are zero, i.e., $M = 0$, the DFM becomes the standard scalar
feedback matrix. For this reason, we denote the zero matrix delay by
$M^S = 0$.
From a design perspective, the matrix delays $M$ of the DFM
should be relatively small compared to the main delays $m$. This
is because the attenuation filters are typically proportional to the
main delays, and every deviation from this proportionality caused by
the DFM distorts the reverberation time specification. To achieve a
scattering-like effect, the matrix delays $M$ are set to less than 10%
of the corresponding main delays $m$ (see Section 3).

The implementation of the DFM is similar to that of the scalar
feedback matrix. The main difference is that the signal vector is
retrieved by $N^2$ memory access operations instead of $N$. All other
arithmetic operations, i.e., element-wise multiplication with $U$ and
the writing operations to the delay lines, are unaltered. Consequently,
the additional computational complexity introduced by the
DFM is relatively small.
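
The $N^2$ delayed reads described above can be sketched in Python as
follows. This is a minimal, unoptimized illustration and not the
authors' implementation: the function name
`dfm_fdn_impulse_response` and the list-based delay line storage are
assumptions made here, and the attenuation filters are omitted as in
(1). For every matrix entry $(i, j)$, delay line $j$ is read at the
offset $m_j + M_{ij}$.

```python
import math


def dfm_fdn_impulse_response(m, U, M, b, c, num_samples, d=0.0):
    """Impulse response of an FDN with a delay feedback matrix (hypothetical helper).

    m: main delays in samples (each >= 1), U: scalar gains, M: matrix delays,
    b / c: input / output gains, d: direct gain.
    """
    N = len(m)
    # v[j][n] holds the sample written into delay line j at time n.
    v = [[0.0] * num_samples for _ in range(N)]
    y = [0.0] * num_samples
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0  # unit impulse input
        # Output: each delay line j is read once at offset m[j].
        y[n] = d * x + sum(c[j] * v[j][n - m[j]] for j in range(N) if n >= m[j])
        # Feedback: N * N reads, one per matrix entry, at offset m[j] + M[i][j].
        for i in range(N):
            acc = b[i] * x
            for j in range(N):
                k = n - m[j] - M[i][j]
                if k >= 0:
                    acc += U[i][j] * v[j][k]
            v[i][n] = acc
    return y


# Toy example with two delay lines (values chosen for illustration only).
g = 1.0 / math.sqrt(2.0)
U = [[g, g], [g, -g]]
scalar = dfm_fdn_impulse_response([4, 9], U, [[0, 0], [0, 0]], [1, 1], [1, 1], 20)
dfm = dfm_fdn_impulse_response([4, 9], U, [[0, 1], [2, 0]], [1, 1], [1, 1], 20)
```

In this toy setting, the two second-order echoes that coincide at
$n = 13$ in the scalar case are moved to $n = 14$ and $n = 15$ by the
matrix delays, while the number of multiplications per sample is
unchanged.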
2.2. Stability
The poles of the proposed FDN are the eigenvalues of the polynomial
matrix $P(z) = D_m(z^{-1}) - A(z)$, which in turn are the roots
of the polynomial $p(z) = \det(P(z))$ [14]. The polynomial degree
of $p(z)$ is then equal to the total number of system poles of the
proposed FDN. FDNs are commonly designed as lossless systems, i.e.,
all system poles lie on the unit circle, to allow precise control over
the reverberation time by introducing additional proportional
attenuation filters [6]. The lossless property of general unitary
networks, which in particular applies to the proposed FDN, was
described by Gerzon [15]. A sufficient lossless condition is given by
$A(z)$ being paraunitary, i.e., $A(z^{-1})^\top A(z) = I$ for real
coefficients, where $I$ is the identity matrix [15].
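Consequently, an FDN with a scalar feedback matrix, i.e.,
$M^S = 0$, is lossless if $U$ is unitary. Further, we show that $A(z)$
is paraunitary if

$$M^P_{ij} = m^{\mathrm{out}}_i + m^{\mathrm{in}}_j \quad \text{for } 1 \le i, j \le N, \tag{3}$$

where $m^{\mathrm{out}}, m^{\mathrm{in}} \in \mathbb{N}^N$ are vectors
of integer delays in samples and $M^P_{ij}$ denotes the elements of the
matrix delays $M^P$. Since the row index $i$ selects
$m^{\mathrm{out}}_i$ and the column index $j$ selects
$m^{\mathrm{in}}_j$, it can be observed that

$$A(z) = U \circ D_{M^P}(z) = D_{m^{\mathrm{out}}}(z) \, U \, D_{m^{\mathrm{in}}}(z). \tag{4}$$

As paraunitary matrices are closed under multiplication and
$D_{m^{\mathrm{in}}}(z)$ and $D_{m^{\mathrm{out}}}(z)$ are paraunitary,
we can conclude that $A(z)$ is paraunitary if $U$ is unitary.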
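
Whether a given matrix of delays has the structure in (3) can be
checked numerically. The following Python sketch uses a hypothetical
helper `split_paraunitary_delays` (the name and the normalization of
$m^{\mathrm{in}}$ to a minimum of zero are assumptions made here, since
the split is only unique up to a constant shift) and applies it to the
two delay matrices of the example in Section 4.

```python
def split_paraunitary_delays(M):
    """Return (m_out, m_in) with M[i][j] == m_out[i] + m_in[j], or None.

    Hypothetical helper: m_in is normalized so that its smallest entry is zero.
    """
    N = len(M)
    # The first row fixes m_in up to a constant; remove that constant.
    shift = min(M[0])
    m_in = [M[0][j] - shift for j in range(N)]
    # The first column then fixes m_out.
    m_out = [M[i][0] - m_in[0] for i in range(N)]
    consistent = all(
        M[i][j] == m_out[i] + m_in[j] for i in range(N) for j in range(N)
    )
    return (m_out, m_in) if consistent else None


# Delay matrices taken from the example in Section 4.
M_P = [
    [456, 1, 10, 447],
    [751, 296, 305, 742],
    [511, 56, 65, 502],
    [647, 192, 201, 638],
]
M_NP = [
    [963, 950, 556, 770],
    [139, 858, 489, 21],
    [286, 3, 773, 137],
    [610, 525, 162, 117],
]
split_P = split_paraunitary_delays(M_P)
split_NP = split_paraunitary_delays(M_NP)
```

Under this normalization, $M^P$ splits into
$m^{\mathrm{out}} = [1, 296, 56, 192]$ and
$m^{\mathrm{in}} = [455, 0, 9, 446]$, whereas no such split exists for
$M^{NP}$.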
Unlike the scalar case $M^S$ and the paraunitary case $M^P$, $A(z)$
is not necessarily paraunitary for other matrix delays $M^{NP}$, such
that no lossless FDN might exist for such matrix delays. In the
following, we show a sufficient condition for the proposed FDN to
be stable instead. To establish the stability condition, we recall a
version of Rouché's theorem for polynomial matrices:

Theorem 1 (Rouché, [16]). Let $Q(z)$ and $R(z)$ be matrix
polynomials. Assume that $Q(z)$ is invertible on the simple closed
curve $\gamma \subset G$. If $\|Q(z)^{-1} R(z)\|_2 < 1$ for all
$z \in \gamma$, then $Q(z) + R(z)$ and $Q(z)$ have the same number of
eigenvalues inside $\gamma$, counting multiplicities.
With this theorem, we can prove the following stability condi-
tion:
Theorem 2. The proposed FDN in (1) is stable if

$$\|A(e^{\imath\omega})\|_2 < 1 \quad \text{for all } \omega. \tag{5}$$

Proof. Let us assign $Q(z) = D_m(z^{-1})$ and $R(z) = -A(z)$
such that $P(z) = Q(z) + R(z)$. Further, let us choose the unit
circle as the closed curve $\gamma$. Now, all the eigenvalues of $Q(z)$
are zero and therefore within $\gamma$. For $z \in \gamma$, we have
$z = e^{\imath\omega}$, and $Q(e^{\imath\omega})$ is unitary for all
$\omega$ such that
$\|Q(e^{\imath\omega})^{-1} R(e^{\imath\omega})\|_2 = \|R(e^{\imath\omega})\|_2$.
Thus, according to Theorem 1, all eigenvalues of $P(z)$ are within
the unit circle if $\|R(e^{\imath\omega})\|_2 < 1$.
Theorem 2 also provides a practical means to stabilize any unstable
FDN by dividing $A(z)$ by $\max_\omega \|A(e^{\imath\omega})\|_2$.
Nonetheless, the missing lossless property makes the non-paraunitary
DFM more tuning-intensive, because the possibly non-uniform modal
decays can cause unwanted ringing modes. While there might be possible
optimization schemes to improve the modal decay distribution, in this
pioneering study, we choose the non-paraunitary DFM empirically.
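
The normalization by $\max_\omega \|A(e^{\imath\omega})\|_2$ can be
approximated numerically. The sketch below is an illustration under
stated assumptions, not the procedure used by the authors: the helper
names `spectral_norm` and `max_dfm_norm`, the uniform frequency grid,
and the use of power iteration are all choices made here. Both the grid
and the power iteration can only underestimate the true maximum, so a
small safety margin would be needed in practice to satisfy the strict
inequality in (5).

```python
import cmath
import math


def spectral_norm(A, iterations=100):
    """Estimate the largest singular value of a square complex matrix.

    Uses power iteration on A^H A; the estimate never exceeds the true value.
    """
    n = len(A)
    x = [complex(1.0 + 0.1 * k) for k in range(n)]  # arbitrary start vector
    scale = math.sqrt(sum(abs(t) ** 2 for t in x))
    x = [t / scale for t in x]
    sigma = 0.0
    for _ in range(iterations):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]  # y = A x
        w = [sum(A[i][j].conjugate() * y[i] for i in range(n)) for j in range(n)]  # w = A^H y
        norm_w = math.sqrt(sum(abs(t) ** 2 for t in w))
        if norm_w == 0.0:
            return 0.0
        sigma = math.sqrt(norm_w)  # ||A^H A x|| approaches sigma_max^2 for unit x
        x = [t / norm_w for t in w]
    return sigma


def max_dfm_norm(U, M, num_freqs=512):
    """Largest estimated ||A(e^{i w})||_2 over a uniform frequency grid."""
    N = len(U)
    best = 0.0
    for k in range(num_freqs):
        w = 2.0 * math.pi * k / num_freqs
        A = [[U[i][j] * cmath.exp(-1j * w * M[i][j]) for j in range(N)]
             for i in range(N)]
        best = max(best, spectral_norm(A))
    return best


# Toy 2 x 2 example (values chosen for illustration only).
g = 1.0 / math.sqrt(2.0)
U = [[g, g], [g, -g]]
norm_p = max_dfm_norm(U, [[1, 3], [4, 6]])   # M_ij = m_out_i + m_in_j as in (3)
norm_np = max_dfm_norm(U, [[0, 0], [0, 1]])  # no such split exists
U_scaled = [[u / norm_np for u in row] for row in U]  # dividing A(z) scales U
```

For the paraunitary toy matrix the estimated norm stays at one for all
frequencies, whereas the non-paraunitary toy matrix exceeds one and has
to be scaled down.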
Figure 3: Echo lattice of FDNs with two delay lines and three different
DFMs: (a) standard FDN with scalar feedback matrix ($M^S$), (b)
proposed FDN with a paraunitary DFM ($M^P$), and (c) proposed FDN with
a non-paraunitary DFM ($M^{NP}$). Each dot indicates a single echo. The
x and y coordinates indicate the time component associated with the
first and second delay line, $m_1$ and $m_2$, respectively. The echo
time $n$ is indicated by the diagonal dotted line. The matrix delays
$M$ are added diagonally. The red arrows indicate the three paths
$p \in \{[1,2,2], [2,1,2], [2,1,1]\}$. More details regarding the echo
lattice can be found in [3].
3. ECHO DENSITY
In this section, we present the echo density¹ of the proposed FDN
and show that it grows significantly faster with reflection order for
paraunitary and non-paraunitary DFMs.
3.1. Absolute Echo Density
The absolute echo density, i.e., the number of echoes at the output
per time unit, of the standard FDN is polynomial over time, and
the exact polynomial is given in [3]. Echoes are closely related
to echo paths $p$ describing the sequence of delay lines which are
traversed by an impulse input until it reaches the output. Formally,
an echo path can be defined as $p \in \{1, 2, \dots, N\}^l$ of length
$l$, which denotes the number of involved delay lines. The echo time
$n_p$ is defined as the time at which the echo appears at the output of
the FDN. In the standard FDN, the echo time $n_p$ of path $p$ is given
by $n_p = \sum_{i=1}^{l} m_{p_i}$ [3], whereas the more general version
for the proposed FDN is

$$n_p = \sum_{i=1}^{l} m_{p_i} + \sum_{i=1}^{l-1} M_{p_{i+1}, p_i}. \tag{6}$$
We rewrite (6) in terms of the delay count

$$[c_p]_i = |\{k \mid p_k = i\}| \quad \text{for } 1 \le i \le N \tag{7}$$

and the matrix routing count

$$[C_p]_{ij} = |\{k \mid p_k = j \text{ and } p_{k+1} = i\}| \quad \text{for } 1 \le i, j \le N, \tag{8}$$

where $|\cdot|$ denotes the cardinality of a set. Hence, the delay and
matrix routing counts denote the number of occurrences of delay lines
and matrix delays in an echo path $p$, respectively. These counts
simplify the following development as they fully determine the echo
time $n_p$. In fact, the echo time of path $p$ may be expressed by

$$n_p = \sum_{i=1}^{N} [c_p \circ m]_i + \sum_{i,j=1}^{N} [C_p \circ M]_{ij}. \tag{9}$$
¹ Although echo typically refers to a subjectively distinct reflection [1],
we refer to a non-zero output of the FDN as an echo. This definition is in
accordance with the term echo density in the literature [17, 3].
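
The equivalence of (6) and (9) can be illustrated with a short Python
sketch. The function names and the toy delays below are assumptions
made for illustration; the three paths are those marked in Fig. 3.
Paths use 1-based delay line indices as in the text.

```python
def echo_time_direct(p, m, M):
    """Echo time via (6): main delays plus matrix delays along the path."""
    main = sum(m[i - 1] for i in p)
    routing = sum(M[p[k + 1] - 1][p[k] - 1] for k in range(len(p) - 1))
    return main + routing


def echo_time_from_counts(p, m, M):
    """Echo time via (9), using the delay count (7) and routing count (8)."""
    N = len(m)
    c = [0] * N
    C = [[0] * N for _ in range(N)]
    for i in p:
        c[i - 1] += 1
    for k in range(len(p) - 1):
        C[p[k + 1] - 1][p[k] - 1] += 1
    main = sum(c[i] * m[i] for i in range(N))
    routing = sum(C[i][j] * M[i][j] for i in range(N) for j in range(N))
    return main + routing


# Toy delays (illustrative values, not taken from the paper).
m = [4, 9]
M = [[0, 1], [2, 0]]
paths = [[1, 2, 2], [2, 1, 2], [2, 1, 1]]
times = [echo_time_direct(p, m, M) for p in paths]
```

With these toy values, the paths $[1,2,2]$ and $[2,1,2]$ share the same
delay count but have different routing counts, so their echo times
differ once the matrix delays are non-zero.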
It is crucial to observe that the echo time $n_p$ for a scalar
feedback matrix ($M^S = 0$) is invariant under permutations of $p$
because $c_p$ is invariant under permutations. The number of paths $p$
of length $l$ with different echo times $n_p$ is

$$E_S(l) = \binom{N+l-1}{N-1}. \tag{10}$$

This quantity is a special case of the equilateral approximation given
in [3]. For a paraunitary DFM with $M^P$ as in (3) with
$m^{\mathrm{out}}$ and $m^{\mathrm{in}}$, we have

$$M^P_{p_{i+1}, p_i} = m^{\mathrm{out}}_{p_{i+1}} + m^{\mathrm{in}}_{p_i} \tag{11}$$

such that $n_p$ is invariant under permutation except for the first and
last element. Thus, the number of paths $p$ of length $l \ge 2$ with
different echo times $n_p$ is

$$E_P(l) = N^2 \binom{N+l-3}{N-1}. \tag{12}$$
Because of the bound on the binomial coefficients
$\binom{q}{r} \le \frac{q^r}{r!}$, the number of paths $E_S(l)$ and
$E_P(l)$ grows polynomially with length $l$. For a non-paraunitary DFM
with properly chosen matrix delays $M^{NP}$, the echo time $n_p$ is
different for each permutation of $p$ where the contribution of
$M^{NP}$ is different. More precisely, the echo times of two paths $p$
and $q$ are equal if $c_p = c_q$ and $C_p = C_q$. The number of paths
$E_{NP}(l)$ is difficult to give in closed form. In fact, $E_{NP}(l)$
is equal to the number of 2-abelian equivalence classes of words of
length $l$ over an alphabet of size $N$ [18]. In [19, Theorem 5.1],
Cassaigne et al. recently showed that $E_{NP}(l)$ is asymptotically
equal to

$$r \, l^{N(N-1)}, \tag{13}$$

where $r$ is a rational constant depending on $N$. Thus, $E_{NP}(l)$
grows polynomially, however with degree $N(N-1)$ instead of $N$. Fig. 3
illustrates the echo times of FDNs with different DFMs in the
echo lattice. In the following, we present a quantitative analysis of
the absolute echo densities depending on the feedback matrix.
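
For short paths, the three path counts can be reproduced by brute-force
enumeration. The following Python sketch is an illustration under the
assumption that the delays are chosen such that echo times coincide
only when forced by the invariants discussed above; the function name
`count_echo_time_classes` is hypothetical. Paths are grouped by the
delay count (scalar), by the delay count together with the first and
last element (paraunitary), and by the delay and routing counts
(non-paraunitary).

```python
from itertools import product
from math import comb


def count_echo_time_classes(N, l):
    """Count paths of length l that can differ in echo time (brute force).

    Returns the counts for the scalar, paraunitary and non-paraunitary case.
    """
    scalar, paraunitary, non_paraunitary = set(), set(), set()
    for p in product(range(N), repeat=l):
        c = tuple(p.count(i) for i in range(N))   # delay count as in (7)
        # Sorted list of transitions: same information as the routing count (8).
        C = tuple(sorted(zip(p[1:], p[:-1])))
        scalar.add(c)
        paraunitary.add((c, p[0], p[-1]))
        non_paraunitary.add((c, C))
    return len(scalar), len(paraunitary), len(non_paraunitary)


N = 4
counts = {l: count_echo_time_classes(N, l) for l in range(2, 5)}
closed_form = {
    l: (comb(N + l - 1, N - 1), N ** 2 * comb(N + l - 3, N - 1))
    for l in range(2, 5)
}
```

For $N = 4$ and $l = 2, 3, 4$, the enumeration agrees with the closed
forms (10) and (12) and with the corresponding entries of Table 1. The
cost grows as $N^l$, so this approach is only practical for short
paths.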
The derivation of a closed-form expression of the absolute echo
density requires the use of sophisticated methods (see [3]), which
is beyond the scope of this work. However, if we neglect the
specific delay lengths, the number of echo paths with discrete echo
times can be related to the absolute echo density. A first
approximation of the absolute echo density can be given by scaling the
number of echo paths with the geometric mean of the main delay
lengths $\prod_{i=1}^{N} m_i^{1/N}$.

Figure 4: Echo density profile and impulse response of the proposed
FDN with three different DFMs: scalar (S), paraunitary (P), and
non-paraunitary (NP), and damping factor $\gamma$. The solid line
indicates the impulse response, whereas the dashed line is the measured
echo density profile. The three plots are offset by 2 for better
readability. (Axes: time in ms from 0 to 3,000; amplitude and echo
density on a linear scale.)
Similar to the standard FDN case, the main delays $m$ and delay
matrix $M$ should be chosen such that as few echo times $n_p$ as
possible coincide. Naturally, this condition can only be satisfied in
the early part of the impulse response, which is also where it matters
most from a perceptual point of view [3].
3.2. Echo Density Profile
Abel and Huang proposed the echo density profile in [17, 20]. The
underlying assumption is that the sound pressure amplitudes in a
reverberant sound field exhibit a Gaussian distribution. With a short
sliding rectangular window of 23 ms, the empirical standard
deviation of impulse response amplitudes is calculated. To determine
how well the empirical amplitude distribution approximates
Gaussian behavior, the proportion of samples outside the empirical
standard deviation is determined and compared to the proportion
expected for a Gaussian distribution. With increasing time and
increasing echo density, the echo density profile increases and reaches
one for a fully dense frame of the impulse response. The perceptual
validity was confirmed in [21].
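
The computation described above can be sketched in Python as follows.
This is a simplified illustration and not the reference implementation
of [17]: the function name `echo_density_profile` and the `hop`
parameter are assumptions made here, and the standard deviation is
computed as the root mean square of the frame, i.e., a zero-mean
signal is assumed. A window of 1104 samples corresponds to 23 ms at
48 kHz.

```python
import math
import random


def echo_density_profile(h, window_length, hop=1):
    """Echo density profile with a rectangular sliding window (sketch)."""
    # Fraction of samples outside one standard deviation for a Gaussian.
    expected = math.erfc(1.0 / math.sqrt(2.0))
    profile = []
    for start in range(0, len(h) - window_length + 1, hop):
        frame = h[start:start + window_length]
        sigma = math.sqrt(sum(s * s for s in frame) / window_length)
        outside = sum(1 for s in frame if abs(s) > sigma)
        profile.append(outside / window_length / expected)
    return profile


# Synthetic test signals (illustrative only): dense noise and sparse echoes.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(8000)]
sparse = [1.0 if n % 500 == 0 else 0.0 for n in range(8000)]
dense_profile = echo_density_profile(noise, 1104, hop=1104)
sparse_profile = echo_density_profile(sparse, 1104, hop=1104)
```

For Gaussian noise the profile stays close to one, whereas a sparse
train of isolated echoes yields values close to zero.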
4. EXAMPLES
We give an example of the proposed FDN and compare it to a
standard FDN. The number of delay lines is $N = 4$, and the delays
are $m = [15805, 5001, 9535, 7201]$ samples, with a sampling
frequency of 48 kHz. The delays are chosen extremely long on purpose
to make the difference in echo density both easily audible and
visible. Nonetheless, the echo density is affected equivalently for
proportionally shorter delays. The matrix gains are an orthogonal
Hadamard matrix

$$U = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ 0.5 & -0.5 & 0.5 & -0.5 \\ 0.5 & 0.5 & -0.5 & -0.5 \\ 0.5 & -0.5 & -0.5 & 0.5 \end{bmatrix} \tag{14}$$
Table 1: Number of paths with different echo times for $N = 4$ and
the three different delay matrices $M^S$, $M^P$, and $M^{NP}$.

l            2     3     4     5      6      7       8
E_S(l)      10    20    35    56     84    120     165
E_P(l)      16    64   160   320    560    896    1344
E_NP(l)     16    64   244   856   2728   7892   20876
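and the input gains $b$ and output gains $c$ are all 1. The delay
matrices are

$$M^P = \begin{bmatrix} 456 & 1 & 10 & 447 \\ 751 & 296 & 305 & 742 \\ 511 & 56 & 65 & 502 \\ 647 & 192 & 201 & 638 \end{bmatrix}, \quad
M^{NP} = \begin{bmatrix} 963 & 950 & 556 & 770 \\ 139 & 858 & 489 & 21 \\ 286 & 3 & 773 & 137 \\ 610 & 525 & 162 & 117 \end{bmatrix}.$$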
We have introduced a reverberation time of approximately 3 s by
replacing each delay element $z^{-1}$ by $\gamma z^{-1}$, where
$\gamma = 0.99995$ for the scalar and paraunitary case, and
$\gamma = 0.99992$ for the non-paraunitary case to guarantee stability
according to Theorem 2. Table 1 shows the number of paths with
different echo times for the three DFM types: $E_S(l)$, $E_P(l)$, and
$E_{NP}(l)$. As shown in Section 3, the number of echoes produced by
the paraunitary and non-paraunitary DFMs is significantly higher than
for the scalar DFM. Fig. 4 shows the impulse responses and the
corresponding echo density profiles. As predicted by the number of
echo paths in Section 3, the echo density profile of the impulse
responses increases from the scalar (S) to the paraunitary (P), and
further to the non-paraunitary (NP) feedback matrix. For instance, the
echo density profile at time 1.5 s is 0.05, 0.25, and 0.82 for (S),
(P), and (NP), respectively. For illustration, we have synthesized
audio examples from the three DFMs depicted in Fig. 4 and provided
them online². It is easily audible that the impulse responses of (P)
and (NP) become dense much more quickly than that of (S). However, at
the end of the impulse response of (NP), slight metallic ringing is
perceivable due to the non-uniform distribution of modal decays caused
by the non-lossless property discussed in Section 2.
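
The relation between the damping factor $\gamma$ and the reverberation
time can be checked with a short calculation. The sketch below assumes
a lossless FDN, in which replacing $z^{-1}$ by $\gamma z^{-1}$ moves
all poles from the unit circle to the radius $\gamma$, so that the
response decays by $20 \log_{10} \gamma$ dB per sample. The function
name `reverberation_time` is hypothetical.

```python
import math


def reverberation_time(gamma, fs):
    """T60 in seconds for a lossless FDN with z^-1 replaced by gamma * z^-1."""
    decay_db_per_sample = 20.0 * math.log10(gamma)  # negative for gamma < 1
    return -60.0 / (decay_db_per_sample * fs)


t60 = reverberation_time(0.99995, 48000)
```

For $\gamma = 0.99995$ at 48 kHz this gives about 2.9 s, in line with
the approximately 3 s stated above for the scalar and paraunitary case.
The formula does not carry over directly to the non-paraunitary case,
where the modal decays are non-uniform.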
5. CONCLUSION
In this work, we propose an extension to the feedback delay net-
work (FDN) by replacing the scalar feedback matrix with a delay
feedback matrix (DFM), which introduces a different delay for each
feedback matrix entry. Based on the stability property, we present
two types of DFMs: paraunitary and non-paraunitary. The echo
density of FDNs with such DFMs is discussed in terms of the num-
ber of echo paths of increasing length. Alternatively, we analyze
the perceived echo density of the impulse responses with the echo
density profile. In an example, we show that both proposed DFMs
increase the echo density significantly. While the non-paraunitary
DFM surpasses the echo density of the paraunitary DFM, it also in-
troduces metallic ringing due to non-uniform modal decays which
impedes the tuning for high-quality artificial reverberation.
² https://www.audiolabs-erlangen.de/resources/2019-WASPAA-DFM-FDN/
6. REFERENCES
[1] H. Kuttruff, Room Acoustics, Fifth Edition. CRC Press, June
2009.
[2] S. Siltanen, T. Lokki, S. Tervo, and L. Savioja, “Modeling
incoherent reflections from rough room surfaces with image
sources,” J. Acoust. Soc. Amer., vol. 131, no. 6, pp. 4606–
4614, 2012.
[3] S. J. Schlecht and E. A. P. Habets, “Feedback delay networks:
Echo density and mixing time,” IEEE/ACM Trans. Audio,
Speech, Lang. Proc., vol. 25, no. 2, pp. 374–383, 2017.
[4] V. Välimäki, J. D. Parker, L. Savioja, J. O. Smith III, and
J. S. Abel, “Fifty years of artificial reverberation,” IEEE/ACM
Trans. Audio, Speech, Lang. Proc., vol. 20, no. 5, pp. 1421–
1448, July 2012.
[5] M. A. Gerzon, “Synthetic stereo reverberation: Part One,” Stu-
dio Sound, vol. 13, pp. 632–635, 1971.
[6] J. M. Jot and A. Chaigne, “Digital delay networks for design-
ing artificial reverberators,” in Proc. Audio Eng. Soc. Conv.,
Paris, France, Feb. 1991, pp. 1–12.
[7] D. Rocchesso, “Maximally diffusive yet efficient feedback de-
lay networks for artificial reverberation,” IEEE Signal Pro-
cess. Lett., vol. 4, no. 9, pp. 252–255, 1997.
[8] ——, “The Ball within the Box: A Sound-Processing
Metaphor,” Comput. Music J., vol. 19, no. 4, pp. 47–57, 1995.
[9] M. R. Schroeder and B. F. Logan, “"Colorless" artificial rever-
beration,” Audio, IRE Transactions on, vol. AU-9, no. 6, pp.
209–214, 1961.
[10] J. A. Moorer, “About this reverberation business,” Comput.
Music J., vol. 3, no. 2, pp. 13–17, June 1979.
[11] L. Dahl and J. M. Jot, “A Reverberator based on Absorbent
All-pass Filters,” in Proc. Int. Conf. Digital Audio Effects
(DAFx), Verona, Italy, Dec. 2000, pp. 1–6.
[12] S. J. Schlecht and E. A. P. Habets, “Time-varying feedback
matrices in feedback delay networks and their application in
artificial reverberation,” J. Acoust. Soc. Amer., vol. 138, no. 3,
pp. 1389–1398, Sept. 2015.
[13] P. Rubak and L. G. Johansen, “Artificial Reverberation Based
on a Pseudo-Random Impulse Response,” in Proc. Audio Eng.
Soc. Conv., Amsterdam, The Netherlands, May 1998, pp. 1–
14.
[14] S. J. Schlecht and E. A. P. Habets, “Modal decomposition
of feedback delay networks,” IEEE Trans. Signal Process.,
2019.
[15] M. A. Gerzon, “Unitary (energy-preserving) multichannel
networks with feedback,” Electronics Letters, vol. 12, no. 11,
pp. 278–279, 1976.
[16] T. R. Cameron, “Spectral Bounds for Matrix Polynomials with
Unitary Coefficients,” Electronic Journal of Linear Algebra,
vol. 30, no. 1, pp. 585–591, Feb. 2015.
[17] J. S. Abel and P. P. Huang, “A Simple, Robust Measure of
Reverberation Echo Density,” in Proc. Audio Eng. Soc. Conv.,
San Francisco, CA, USA, Oct. 2006, pp. 1–10.
[18] J. Karhumäki, S. Puzynina, M. Rao, and M. A. Whiteland,
“On cardinalities of k-abelian equivalence classes,” Theoreti-
cal Computer Science, vol. 658, pp. 190–204, Jan. 2017.
[19] J. Cassaigne, J. Karhumäki, S. Puzynina, and M. A. White-
land, “k-Abelian Equivalence and Rationality,” Fundamenta
Informaticae, vol. 154, no. 1-4, pp. 65–94, Jan. 2017.
[20] P. P. Huang and J. S. Abel, “Aspects of reverberation echo den-
sity,” in Proc. Audio Eng. Soc. Conv., New York, NY, USA,
Oct. 2007, pp. 1–7.
[21] A. Lindau, L. Kosanke, and S. Weinzierl, “Perceptual Eval-
uation of Model- and Signal-Based Predictors of the Mixing
Time in Binaural Room Impulse Responses,” J. Audio Eng.
Soc., vol. 60, no. 11, pp. 887–898, 2012.