2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics October 20-23, 2019, New Paltz, NY

DENSE REVERBERATION WITH DELAY FEEDBACK MATRICES

Sebastian J. Schlecht and Emanuël A.P. Habets

International Audio Laboratories Erlangen∗, Germany

sebastian.schlecht@audiolabs-erlangen.de

ABSTRACT

Feedback delay networks (FDNs) belong to a general class of re-

cursive ﬁlters which are widely used in artiﬁcial reverberation and

decorrelation applications. One central challenge in the design of

FDNs is the generation of sufﬁcient echo density in the impulse

response without compromising the computational efﬁciency. In a

previous contribution, we have demonstrated that the echo density

of an FDN grows polynomially over time, and that the growth de-

pends on the number and lengths of the delays. In this work, we

introduce so-called delay feedback matrices (DFMs) where each

matrix entry is a scalar gain and a delay. While the computational

complexity of DFMs is similar to a scalar-only feedback matrix,

we show that the echo density grows signiﬁcantly faster over time,

however, at the cost of non-uniform modal decays.

Index Terms—Feedback Delay Network, Artiﬁcial Reverber-

ation, Echo Density

1. INTRODUCTION

If sound is emitted in a room, the sound waves travel across the

space and are repeatedly reﬂected at the room boundaries resulting

in acoustic reverberation [1]. If the sound is reflected at a smooth boundary, the reflection is coherent (specular), whereas it is incoherent (scattered) when reflected by a rough surface. In small to large rooms, the mean free path of specular reflections corresponds to 10 to 100 ms, whereas an incoherent reflection produces a dense response lasting a few milliseconds [2]. In geometric room acoustics, an incoherent reflection can be effectively generated from a set of closely spaced image sources [2] (see Fig. 1). Consequently, scattering increases the total number of image sources and therefore the echo density [3]. In this work, we propose a method to introduce scattering-like effects into artificial reverberation filter structures.

Many artiﬁcial reverberators have been developed in recent

years [4] among which the feedback delay network (FDN), orig-

inally proposed by Gerzon [5] and further developed by Jot and

Chaigne [6], is one of the most popular. The FDN consists of N

delay lines combined with attenuation ﬁlters which are fed back

via a scalar feedback matrix A. To maximize the echo density,

we assume throughout this work that the feedback matrix $A$ is dense, i.e., all matrix entries are non-zero [7]. In [3], the present authors demonstrated that the absolute echo density of an FDN grows polynomially with the degree equal to the number of delay lines $N$ and

inversely to the delay lengths, i.e., shorter delay lines produce faster

increasing absolute echo density. A major challenge of FDN design is to achieve sufficient echo density because of an inherent trade-off between three aspects: computational complexity, modal density, and echo density. A higher number of delays increases both modal and echo density, but also the computational complexity. With a fixed number of delays, shorter delays increase the echo density but simultaneously lower the modal density [6].

∗ The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institut für Integrierte Schaltungen IIS.

Figure 1: Second order reflections at two rough surfaces represented with an equivalent set of image sources (inspired by [2]). The energy of the image sources is indicated by the color saturation. [Diagram labels: Source, Receiver, Wall 1, Wall 2, first order image sources, second order image sources.]

To reproduce a scattering-like effect, we require a set of long

delays which are proportional to the mean free path [8], and a set of

ﬁlters to add the short-term density to each reﬂection (see Fig. 1).

Feedforward-feedback allpass ﬁlters have been introduced in series

to increase the short-term echo density [9, 10]. Alternatively, all-

pass ﬁlters may be placed after the delay lines [11] which in turn

doubles the effective size of the FDN [12]. Instead of allpass ﬁlters,

FIR ﬁlters with pseudo-random exponentially decaying coefﬁcients

were proposed [13].

In this contribution, we propose an FDN with a delay feedback

matrix (DFM), i.e., each matrix entry has a delay and a gain. We

demonstrate that the DFM can increase the echo density signiﬁ-

cantly without introducing further ﬁltering. In Section 2, we present

the new ﬁlter structure and compare it to the standard FDN. In Sec-

tion 3, we demonstrate the echo density of the proposed FDN. In

Section 4, an example structure is presented.

2. PROPOSED FEEDBACK DELAY NETWORK

In the following, we present the proposed extension to the FDN and

explain the stability conditions.


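Figure 2: Delay feedback delay network with three delays ($N = 3$) and a delay feedback matrix $A(z)$ instead of the standard scalar feedback matrix $A$. [Block diagram labels: input $X(z)$, input gains $b_1, b_2, b_3$, delay lines $z^{-m_1}, z^{-m_2}, z^{-m_3}$, output gains $c_1, c_2, c_3$, output $Y(z)$, feedback matrix $A(z)$.]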

2.1. Delay Feedback Matrix

The proposed FDN is structured similarly to a standard FDN, but employs a delay feedback matrix (DFM) $A(z)$ instead of a scalar feedback matrix $A$ (see Fig. 2). The transfer function of the proposed FDN is

$$H(z) = c^\top [D_m(z^{-1}) - A(z)]^{-1} b + d, \tag{1}$$

where the column vectors $b$ and $c$ collect the input gains $b_i$ and the output gains $c_i$, respectively. The lengths of the $N$ delays in samples are given by $m = [m_1, \dots, m_N] \in \mathbb{N}^N$, and $D_m(z) = \operatorname{diag}[z^{-m_1}, z^{-m_2}, \dots, z^{-m_N}]$ is the diagonal $N \times N$ delay matrix. The attenuation filters which are usually applied in the feedback loop are omitted for clarity and can be added as an additional processing unit in series with the delay lines [6]. The $N \times N$ DFM $A(z)$ is given by

$$A(z) = U \circ D_M(z), \tag{2}$$

where $\circ$ denotes the Hadamard (element-wise) product, $U$ is a scalar matrix, $M$ is the matrix of non-negative integer matrix delays, and the entries of the delay matrix $D_M(z)$ are $[D_M(z)]_{ij} = z^{-M_{ij}}$ for $1 \le i, j \le N$. If all matrix delays are zero, i.e., $M = 0$, the DFM becomes the standard scalar feedback matrix. For this reason, we denote the zero matrix delay by $M^{S} = 0$.

From a design perspective, the matrix delays $M$ of the DFM should be relatively small compared to the main delays $m$. This is because the attenuation filters are typically proportional to the main delays and every deviation from this proportionality by the DFM distorts the reverberation time specification. To achieve a scattering-like effect, the matrix delays $M$ are set to less than 10% of the corresponding main delays $m$ (see Section 3).

The implementation of the DFM is similar to the scalar feedback matrix. The main difference is that the signal vector is retrieved by $N^2$ memory access operations instead of $N$. All other arithmetic operations, i.e., element-wise multiplication with $U$ and the writing operations to the delay lines, are unaltered. Consequently, the additional computational complexity introduced by the DFM is relatively small.
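The implementation described above can be sketched as a sample-by-sample loop. The following Python code is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `dfm_fdn_impulse_response`, the circular-buffer layout, and the restriction to an impulse input are choices made here, and attenuation filters are omitted as in (1). Entry $(i, j)$ of the matrix is taken to route the output of delay line $j$ to the input of delay line $i$.

```python
import numpy as np

def dfm_fdn_impulse_response(m, U, M, b, c, d=0.0, n_samples=1000):
    """Impulse response of an FDN whose feedback matrix is A(z) = U o D_M(z).

    Assumed convention: entry (i, j) routes the output of delay line j
    to the input of delay line i, with gain U[i, j] and extra delay M[i, j].
    """
    m = np.asarray(m, dtype=int)
    M = np.asarray(M, dtype=int)
    U = np.asarray(U, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    if m.min() < 1 or M.min() < 0:
        raise ValueError("main delays must be >= 1 and matrix delays >= 0")
    N = len(m)
    lines = np.arange(N)
    # One circular buffer per delay line, long enough for m_j + max_i M_ij.
    buf_len = int(m.max() + M.max()) + 1
    buf = np.zeros((N, buf_len))
    h = np.zeros(n_samples)
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        # N reads for the output taps: line j delayed by m_j.
        h[n] = c @ buf[lines, (n - m) % buf_len] + d * x
        # N^2 reads for the feedback: line j delayed by m_j + M_ij.
        taps = buf[lines[None, :], (n - m[None, :] - M) % buf_len]
        # Element-wise gains and one write per delay line, as in the scalar case.
        buf[:, n % buf_len] = b * x + np.sum(U * taps, axis=1)
    return h
```

Setting `M` to an all-zero matrix reduces the $N^2$ reads to $N$ distinct values, which corresponds to the standard scalar feedback matrix.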

2.2. Stability

The poles of the proposed FDN are the eigenvalues of the polynomial matrix $P(z) = D_m(z^{-1}) - A(z)$, which in turn are the roots of the polynomial $p(z) = \det(P(z))$ [14]. The polynomial degree of $p(z)$ is then equal to the total number of system poles of the proposed FDN. FDNs are commonly designed as lossless systems, i.e., all system poles lie on the unit circle, to allow precise control over the reverberation time by introducing additional proportional attenuation filters [6]. The lossless property of general unitary networks, which in particular applies to the proposed FDN, was described by Gerzon [15]. A sufficient lossless condition is given by $A(z)$ being paraunitary, i.e., $A(z^{-1})^\top A(z) = I$ for real coefficients, where $I$ is the identity matrix [15].

Consequently, an FDN with a scalar feedback matrix, i.e., $M^{S} = 0$, is lossless if $U$ is unitary. Further, we show that $A(z)$ is paraunitary if

$$M^{P}_{ij} = m^{out}_i + m^{in}_j \quad \text{for } 1 \le i, j \le N, \tag{3}$$

where $m^{out}, m^{in} \in \mathbb{N}^N$ are vectors of integer delays in samples and $M^{P}_{ij}$ denotes the elements of the matrix delays $M^{P}$. Because the $(i, j)$ entry of $D_{m^{out}}(z)\, U\, D_{m^{in}}(z)$ is $z^{-m^{out}_i} U_{ij} z^{-m^{in}_j}$, it follows that

$$A(z) = U \circ D_{M^{P}}(z) = D_{m^{out}}(z)\, U\, D_{m^{in}}(z). \tag{4}$$

As paraunitary matrices are closed under multiplication and $D_{m^{in}}(z)$ and $D_{m^{out}}(z)$ are paraunitary, we can conclude that $A(z)$ is paraunitary if $U$ is unitary.
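The paraunitary condition can be checked numerically on the unit circle, where it reduces to $A(e^{\imath\omega})$ being unitary at every frequency. The sketch below is an illustration under assumptions: the helper names and the small delay vectors `m_out` and `m_in` are hypothetical, and a finite frequency grid is only a spot check, not a proof.

```python
import numpy as np

def dfm_frequency_response(U, M, omega):
    """Evaluate A(e^{i omega}) = U o D_M(e^{i omega}) at one frequency."""
    return np.asarray(U, dtype=float) * np.exp(-1j * omega * np.asarray(M, dtype=float))

def max_unitarity_error(U, M, n_freqs=64):
    """Largest entry of |A^H A - I| over a grid of frequencies in [0, pi]."""
    identity = np.eye(len(U))
    worst = 0.0
    for omega in np.linspace(0.0, np.pi, n_freqs):
        A = dfm_frequency_response(U, M, omega)
        worst = max(worst, float(np.abs(A.conj().T @ A - identity).max()))
    return worst

# Orthogonal Hadamard gains, scaled to be unitary.
U = 0.5 * np.array([[1, 1, 1, 1],
                    [1, -1, 1, -1],
                    [1, 1, -1, -1],
                    [1, -1, -1, 1]])

# Hypothetical delay vectors; the matrix delays follow the additive structure in (3).
m_out = np.array([0, 3, 1, 2])
m_in = np.array([4, 0, 2, 1])
M_P = m_out[:, None] + m_in[None, :]

# Perturbing a single entry breaks the additive structure.
M_NP = M_P.copy()
M_NP[0, 0] += 1

print(max_unitarity_error(U, M_P))   # numerically zero
print(max_unitarity_error(U, M_NP))  # clearly non-zero
```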

Unlike the scalar case $M^{S}$ and the paraunitary case $M^{P}$, $A(z)$ is not necessarily paraunitary for other matrix delays $M^{NP}$, so that a lossless FDN might not exist for such matrix delays. In the following, we instead show a sufficient condition for the proposed FDN to be stable. To establish the stability condition, we recall a version of Rouché's theorem for polynomial matrices:

Theorem 1 (Rouché, [16]). Let $Q(z)$ and $R(z)$ be matrix polynomials. Assume that $Q(z)$ is invertible on the simple closed curve $\gamma \subset G$. If $\|Q(z)^{-1} R(z)\|_2 < 1$ for all $z \in \gamma$, then $Q(z) + R(z)$ and $Q(z)$ have the same number of eigenvalues inside $\gamma$, counting multiplicities.

With this theorem, we can prove the following stability condi-

tion:

Theorem 2. The proposed FDN in (1) is stable if

$$\|A(e^{\imath\omega})\|_2 < 1 \quad \text{for all } \omega. \tag{5}$$

Proof. Let us assign $Q(z) = D_m(z^{-1})$ and $R(z) = -A(z)$ such that $P(z) = Q(z) + R(z)$. Further, let us choose the unit circle as the closed curve $\gamma$. Now, all the eigenvalues of $Q(z)$ are zero and therefore within $\gamma$. For $z \in \gamma$, we have $z = e^{\imath\omega}$ and $Q(e^{\imath\omega})$ is unitary for all $\omega$ such that $\|Q(e^{\imath\omega})^{-1} R(e^{\imath\omega})\|_2 = \|R(e^{\imath\omega})\|_2$. Thus, according to Theorem 1, all eigenvalues of $P(z)$ are within the unit circle if $\|R(e^{\imath\omega})\|_2 < 1$.

Theorem 2 also provides a practical means to stabilize any unstable FDN by dividing $A(z)$ by $\max_\omega \|A(e^{\imath\omega})\|_2$. Nonetheless, the missing lossless property makes the non-paraunitary DFM more tuning-intensive, as the possibly non-uniform modal decays can cause unwanted ringing modes. While there might be optimization schemes to improve the modal decay distribution, in this pioneering study, we choose the non-paraunitary DFM empirically.
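The normalization suggested by Theorem 2 can be approximated numerically by sampling the spectral norm on a frequency grid. The following sketch is an assumption-based illustration: the function names, the grid size, and the small safety `margin` (added here so that the sampled norm ends up strictly below one) are not part of the paper, and a finite grid may slightly underestimate the true maximum.

```python
import numpy as np

def max_spectral_norm(U, M, n_freqs=4096):
    """Approximate max over omega of ||U o D_M(e^{i omega})||_2 on a grid."""
    U = np.asarray(U, dtype=float)
    M = np.asarray(M, dtype=float)
    # Real coefficients make the norm symmetric in omega, so [0, pi] suffices.
    omegas = np.linspace(0.0, np.pi, n_freqs)
    A = U[None, :, :] * np.exp(-1j * omegas[:, None, None] * M[None, :, :])
    # ord=2 with a 2-tuple axis returns the largest singular value per frequency.
    return float(np.linalg.norm(A, ord=2, axis=(1, 2)).max())

def stabilize_gains(U, M, margin=1e-3):
    """Scale the gains so that the sampled norm is strictly below one."""
    g = max_spectral_norm(U, M)
    return np.asarray(U, dtype=float) / (g * (1.0 + margin))

# Hypothetical 2 x 2 example: unitary gains with one non-zero matrix delay.
U2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
M2 = np.array([[0, 1], [0, 0]])
print(max_spectral_norm(U2, np.zeros((2, 2))))       # scalar case
print(max_spectral_norm(U2, M2))                     # non-paraunitary case
print(max_spectral_norm(stabilize_gains(U2, M2), M2))
```

Scaling only `U` is sufficient here because the delays in $D_M(z)$ have unit magnitude on the unit circle.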


[Figure 3 panels, each with axes $m_1$ and $m_2$ and diagonal time $n$: (a) Standard FDN with scalar feedback matrix. (b) Proposed FDN with a paraunitary DFM. (c) Proposed FDN with a non-paraunitary DFM.]

Figure 3: Echo lattice FDNs with two delay lines and three different DFMs: scalar $M^{S}$, paraunitary $M^{P}$, and non-paraunitary $M^{NP}$. Each dot indicates a single echo. The x and y coordinates indicate the time component associated with the first and second delay line, respectively, $m_1$ and $m_2$. The echo time $n$ is indicated by the diagonal dotted line. The matrix delays $M$ are added diagonally. The red arrows indicate the three paths $p \in \{[1,2,2], [2,1,2], [2,1,1]\}$. More details regarding the echo lattice can be found in [3].

3. ECHO DENSITY

In this section, we present the echo density¹ of the proposed FDN and show that it grows significantly faster with reflection order for paraunitary and non-paraunitary DFMs.

3.1. Absolute Echo Density

The absolute echo density, i.e., the number of echoes at the output per time unit, of the standard FDN is polynomial over time and the exact polynomial is given in [3]. Echoes are closely related to echo paths $p$ describing the sequence of delay lines which are traversed by an impulse input until it reaches the output. Formally, an echo path can be defined as $p \in \{1, 2, \dots, N\}^l$ of length $l$, which denotes the number of involved delay lines. The echo time $n_p$ is defined as the time at which the echo appears at the output of the FDN. In a standard FDN, the echo time $n_p$ of path $p$ is given by $n_p = \sum_{i=1}^{l} m_{p_i}$ [3], whereas the more general version for the proposed FDN is

$$n_p = \sum_{i=1}^{l} m_{p_i} + \sum_{i=1}^{l-1} M_{p_{i+1},p_i}. \tag{6}$$

We rewrite (6) in terms of the delay count

$$[c_p]_i = |\{k \mid p_k = i\}| \quad \text{for } 1 \le i \le N \tag{7}$$

and the matrix routing count

$$[C_p]_{ij} = |\{k \mid p_k = j \text{ and } p_{k+1} = i\}| \quad \text{for } 1 \le i, j \le N, \tag{8}$$

where $|\cdot|$ denotes the cardinality of a set. Hence, the delay and matrix routing counts denote the number of occurrences of delay lines and matrix delays in an echo path $p$, respectively. These counts simplify the following development as they fully determine the echo time $n_p$. In fact, the echo time of path $p$ may be expressed by

$$n_p = \sum_{i=1}^{N} [c_p \circ m]_i + \sum_{i,j=1}^{N} [C_p \circ M]_{ij}. \tag{9}$$
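The equivalence of the path-wise sum in (6) and the count-based form in (7) to (9) can be illustrated with a short sketch. The delay values `m` and `M` below are hypothetical toy numbers for two delay lines, the function names are illustrative, and the paths are the three paths highlighted in Fig. 3, written with 1-based delay-line indices.

```python
import numpy as np

def echo_time_direct(p, m, M):
    """Echo time via (6): sum of main delays plus matrix delays along the path."""
    p = np.asarray(p) - 1                      # 1-based path to 0-based indices
    m, M = np.asarray(m), np.asarray(M)
    return int(m[p].sum() + M[p[1:], p[:-1]].sum())

def echo_time_counts(p, m, M):
    """Echo time via (7)-(9): delay count c_p and matrix routing count C_p."""
    p = np.asarray(p) - 1
    m, M = np.asarray(m), np.asarray(M)
    N = len(m)
    c = np.bincount(p, minlength=N)            # [c_p]_i = occurrences of line i
    C = np.zeros((N, N), dtype=int)
    np.add.at(C, (p[1:], p[:-1]), 1)           # [C_p]_ij = transitions j -> i
    return int((c * m).sum() + (C * M).sum())

m = [3, 5]                                     # hypothetical main delays
M = [[1, 2],
     [4, 0]]                                   # hypothetical matrix delays
for path in ([1, 2, 2], [2, 1, 2], [2, 1, 1]):
    print(path, echo_time_direct(path, m, M), echo_time_counts(path, m, M))
```

With `M` set to zero, the paths `[1, 2, 2]` and `[2, 1, 2]` share the same echo time because they have the same delay count; the non-zero matrix delays separate them.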

¹ Although echo typically refers to a subjectively distinct reflection [1], we refer to a non-zero output of the FDN as an echo. This definition is in accordance with the term echo density in the literature [17, 3].

It is crucial to observe that the echo time $n_p$ for a scalar feedback matrix ($M^{S} = 0$) is invariant under permutations of $p$ because $c_p$ is invariant under permutations. The number of paths $p$ of length $l$ with different echo times $n_p$ is

$$E_{S}(l) = \binom{N+l-1}{N-1}. \tag{10}$$

This quantity is a special case of the equilateral approximation given in [3]. For a paraunitary DFM with $M^{P}$ as in (3) with $m^{out}$ and $m^{in}$, we have

$$M^{P}_{p_{i+1},p_i} = m^{out}_{p_{i+1}} + m^{in}_{p_i} \tag{11}$$

such that $n_p$ is invariant under permutation except for the first and last element. Thus, the number of paths $p$ of length $l \ge 2$ with different echo times $n_p$ is

$$E_{P}(l) = N^2 \binom{N+l-3}{N-1}. \tag{12}$$

Because of the bound on the binomial coefficients $\binom{q}{r} \le \frac{q^r}{r!}$, the number of paths $E_{S}(l)$ and $E_{P}(l)$ grows polynomially with length $l$. For a non-paraunitary DFM with properly chosen matrix delays $M^{NP}$, the echo time $n_p$ is different for each permutation of $p$ where the contribution of $M^{NP}$ is different. More precisely, the echo times of two paths $p$ and $q$ are equal if $c_p = c_q$ and $C_p = C_q$. The number of paths $E_{NP}(l)$ is difficult to give in closed form. In fact, $E_{NP}(l)$ is equal to the number of 2-abelian equivalence classes of words of length $l$ over an alphabet of size $N$ [18]. In [19, Theorem 5.1], Cassaigne et al. recently showed that $E_{NP}(l)$ is asymptotically equal to

$$r\, l^{N(N-1)}, \tag{13}$$

where $r$ is a rational constant depending on $N$. Thus, $E_{NP}(l)$ grows polynomially, however with degree $N(N-1)$ instead of $N$. Fig. 3 illustrates the echo times of FDNs with different DFMs in the echo lattice. In the following, we present a quantitative analysis of the absolute echo densities depending on the feedback matrix.
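For small $N$ and $l$, the three path counts can be obtained by brute-force enumeration of all $N^l$ paths. The sketch below is illustrative: the function name and the grouping keys are choices made here. Paths are grouped by the quantities that determine the echo time in each case: the delay count for the scalar matrix, the first and last elements plus the remaining multiset for the paraunitary DFM, and the pair of delay count and matrix routing count for the non-paraunitary DFM. It counts equivalence classes only and does not check whether specific delay values happen to produce coinciding echo times.

```python
from collections import Counter
from itertools import product
from math import comb

def count_distinct_echo_classes(N, l):
    """Return (E_S, E_P, E_NP) for paths of length l >= 2 over N delay lines."""
    scalar, paraunitary, non_paraunitary = set(), set(), set()
    for p in product(range(N), repeat=l):
        # Scalar matrix: only the multiset of visited delay lines matters.
        scalar.add(tuple(sorted(p)))
        # Paraunitary DFM: first and last element matter in addition.
        paraunitary.add((p[0], p[-1], tuple(sorted(p[1:-1]))))
        # Non-paraunitary DFM: delay count c_p and matrix routing count C_p.
        delay_count = tuple(sorted(Counter(p).items()))
        routing_count = tuple(sorted(Counter(zip(p[:-1], p[1:])).items()))
        non_paraunitary.add((delay_count, routing_count))
    return len(scalar), len(paraunitary), len(non_paraunitary)

N = 4
for l in range(2, 6):
    e_s, e_p, e_np = count_distinct_echo_classes(N, l)
    # Closed forms (10) and (12) for comparison.
    print(l, e_s, comb(N + l - 1, N - 1), e_p, N**2 * comb(N + l - 3, N - 1), e_np)
```

For $N = 4$, the enumerated values can be compared against the corresponding columns of Table 1.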

The derivation of a closed-form expression of the absolute echo density requires sophisticated methods (see [3]), which are beyond the scope of this work. However, if we neglect the specific delay lengths, the number of echo paths with distinct echo times can be related to the absolute echo density. A first approximation of the absolute echo density can be given by scaling the number of echo paths with the geometric mean of the main delay lengths $\prod_{i=1}^{N} m_i^{1/N}$.

Similar to the standard FDN case, the main delays $m$ and delay matrix $M$ should be chosen such that as few echo times $n_p$ as possible coincide. Naturally, this condition can only be satisfied in the early part of the impulse response, which is also where it matters most from a perceptual point of view [3].

Figure 4: Echo density profile and impulse response of the proposed FDN with three different DFMs: scalar (S), paraunitary (P), and non-paraunitary (NP), and damping factor $\gamma$. The solid line indicates the impulse response, whereas the dashed line is the measured echo density profile. The three plots are offset by 2 for better readability. [Plot axes: Time [ms] from 0 to 3,000; Amplitude and Echo Density [lin]; curves labeled S, P, NP.]

3.2. Echo Density Proﬁle

Abel and Huang proposed the echo density profile in [17, 20]. The underlying assumption is that the sound pressure amplitudes in a reverberant sound field exhibit a Gaussian distribution. With a short sliding rectangular window of 23 ms, the empirical standard deviation of impulse response amplitudes is calculated. To determine how well the empirical amplitude distribution approximates a Gaussian behavior, the proportion of samples outside the empirical standard deviation is determined and compared to the proportion expected for a Gaussian distribution. With increasing time and increasing echo density, the echo density profile increases and reaches one for a fully dense frame of the impulse response. The perceptual validity was confirmed in [21].
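The measure described above can be sketched in a few lines. This is a simplified illustration under assumptions, not a reference implementation of [17, 20]: the function name and the `hop` parameter are introduced here, the window is rectangular, and the standard deviation is computed as the root mean square of the frame, i.e., a zero-mean signal is assumed. The expected proportion of samples outside one standard deviation for a Gaussian distribution is $\operatorname{erfc}(1/\sqrt{2}) \approx 0.3173$.

```python
import numpy as np
from math import erfc, sqrt

def echo_density_profile(h, fs, window_ms=23.0, hop=1):
    """Sliding-window echo density profile of an impulse response h."""
    h = np.asarray(h, dtype=float)
    win = max(int(round(window_ms * 1e-3 * fs)), 1)
    # Proportion of samples outside one standard deviation for a Gaussian.
    expected = erfc(1.0 / sqrt(2.0))
    starts = range(0, len(h) - win + 1, hop)
    profile = np.zeros(len(starts))
    for k, n in enumerate(starts):
        frame = h[n:n + win]
        # Zero-mean assumption: standard deviation taken as the frame RMS.
        sigma = np.sqrt(np.mean(frame ** 2))
        if sigma > 0.0:
            profile[k] = np.mean(np.abs(frame) > sigma) / expected
    return profile

# Usage: Gaussian noise should give values near one, sparse echoes values near zero.
rng = np.random.default_rng(0)
dense = echo_density_profile(rng.standard_normal(4800), fs=8000, hop=16)
sparse_ir = np.zeros(4800)
sparse_ir[::400] = 1.0
sparse = echo_density_profile(sparse_ir, fs=8000, hop=16)
print(dense.mean(), sparse.max())
```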

4. EXAMPLES

We give an example of the proposed FDN and compare it to a standard FDN. The number of delay lines is $N = 4$, and the delays are $m = [15805, 5001, 9535, 7201]$ samples at a sampling frequency of 48 kHz. The delays are chosen to be extremely long on purpose to make the difference in echo density both easily audible and visible. Nonetheless, the echo density is affected equivalently for proportionally shorter delays. The matrix gains are an orthogonal Hadamard matrix

$$U = \begin{bmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ 0.5 & -0.5 & 0.5 & -0.5 \\ 0.5 & 0.5 & -0.5 & -0.5 \\ 0.5 & -0.5 & -0.5 & 0.5 \end{bmatrix} \tag{14}$$

Table 1: Number of paths with different echo times for $N = 4$ and the three different delay matrices $M^{S}$, $M^{P}$, and $M^{NP}$.

l           2    3    4     5     6      7      8
E_S(l)     10   20   35    56    84    120    165
E_P(l)     16   64  160   320   560    896   1344
E_NP(l)    16   64  244   856  2728   7892  20876

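and input gains $b$ and output gains $c$ are all 1. The delay matrices are

$$M^{P} = \begin{bmatrix} 456 & 1 & 10 & 447 \\ 751 & 296 & 305 & 742 \\ 511 & 56 & 65 & 502 \\ 647 & 192 & 201 & 638 \end{bmatrix}, \quad M^{NP} = \begin{bmatrix} 963 & 950 & 556 & 770 \\ 139 & 858 & 489 & 21 \\ 286 & 3 & 773 & 137 \\ 610 & 525 & 162 & 117 \end{bmatrix}.$$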
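Whether a given matrix of delays has the additive structure of (3) can be checked by trying to split it into two delay vectors. The sketch below applies this to the two example matrices. The function name and the particular normalization (smallest entry of the output vector set to zero) are assumptions made for illustration; the split is not unique, since a constant can be shifted between the two vectors.

```python
import numpy as np

M_P = np.array([[456, 1, 10, 447],
                [751, 296, 305, 742],
                [511, 56, 65, 502],
                [647, 192, 201, 638]])

M_NP = np.array([[963, 950, 556, 770],
                 [139, 858, 489, 21],
                 [286, 3, 773, 137],
                 [610, 525, 162, 117]])

def split_matrix_delays(M):
    """Return (m_out, m_in) with M[i, j] == m_out[i] + m_in[j], or None."""
    M = np.asarray(M)
    # Fix the free constant by letting the smallest output delay be zero.
    m_out = M[:, 0] - M[:, 0].min()
    m_in = M[0, :] - m_out[0]
    if np.array_equal(m_out[:, None] + m_in[None, :], M):
        return m_out, m_in
    return None

print(split_matrix_delays(M_P))   # a valid pair of delay vectors
print(split_matrix_delays(M_NP))  # None: no additive structure
```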

We have introduced a reverberation time of approximately 3 s by replacing each delay element $z^{-1}$ by $\gamma z^{-1}$, where $\gamma = 0.99995$ for the scalar and paraunitary case, and $\gamma = 0.99992$ for the non-paraunitary case to guarantee stability according to Theorem 2. Table 1 shows the number of paths with different echo times for the three DFM types: $E_{S}(l)$, $E_{P}(l)$, and $E_{NP}(l)$. As shown in Section 3, the number of echoes produced by the paraunitary and non-paraunitary DFMs is significantly higher than for the scalar DFM.
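The relation between the damping factor and the reverberation time can be made explicit with a small calculation. Replacing $z^{-1}$ by $\gamma z^{-1}$ scales every pole of a lossless prototype to magnitude $\gamma$, so the envelope decays as $\gamma^n$ and reaches 60 dB of attenuation when $\gamma^n = 10^{-3}$. The sketch below uses illustrative function names and applies only under that lossless assumption; for the non-paraunitary case the modal decays are not uniform, so a single value does not describe all modes.

```python
import math

def rt60_from_gamma(gamma, fs):
    """Seconds until a per-sample gain gamma has decayed by 60 dB."""
    return -3.0 / (math.log10(gamma) * fs)

def gamma_from_rt60(rt60, fs):
    """Per-sample gain that yields the given 60 dB decay time."""
    return 10.0 ** (-3.0 / (rt60 * fs))

# Damping factor of the scalar and paraunitary cases at 48 kHz.
print(rt60_from_gamma(0.99995, 48000))
```

For $\gamma = 0.99995$ at 48 kHz, this evaluates to roughly 2.9 s, in line with the stated reverberation time of approximately 3 s.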

Fig. 4 shows the impulse responses and the corresponding echo density profiles. As predicted by the number of echo paths in Section 3, the echo density profile of the impulse responses increases from the scalar (S) to the paraunitary (P), and further to the non-paraunitary (NP) feedback matrix. For instance, the echo density profile at time 1.5 s is 0.05, 0.25, and 0.82, respectively. For illustration, we have synthesized audio examples from the three DFMs depicted in Fig. 4 and provided them online². It is easily audible that the impulse responses of (P) and (NP) become dense much more quickly than that of (S). However, at the end of the impulse response of (NP), slight metallic ringing is perceivable due to the non-uniform distribution of modal decays caused by the non-lossless property discussed in Section 2.

5. CONCLUSION

In this work, we propose an extension to the feedback delay net-

work (FDN) by replacing the scalar feedback matrix with a delay

feedback matrix (DFM), which introduces a different delay for each

feedback matrix entry. Based on the stability property, we present

two types of DFMs: paraunitary and non-paraunitary. The echo

density of FDNs with such DFMs is discussed in terms of the num-

ber of echo paths of increasing length. Alternatively, we analyze

the perceived echo density of the impulse responses with the echo

density profile. In an example, we show that both proposed DFMs increase the echo density significantly. While the non-paraunitary DFM surpasses the echo density of the paraunitary DFM, it also introduces metallic ringing due to non-uniform modal decays, which impedes the tuning for high-quality artificial reverberation.

² https://www.audiolabs-erlangen.de/resources/2019-WASPAA-DFM-FDN/

6. REFERENCES

[1] H. Kuttruff, Room Acoustics, Fifth Edition. CRC Press, June

2009.

[2] S. Siltanen, T. Lokki, S. Tervo, and L. Savioja, “Modeling

incoherent reﬂections from rough room surfaces with image

sources,” J. Acoust. Soc. Amer., vol. 131, no. 6, pp. 4606–

4614, 2012.

[3] S. J. Schlecht and E. A. P. Habets, “Feedback delay networks:

Echo density and mixing time,” IEEE/ACM Trans. Audio,

Speech, Lang. Proc., vol. 25, no. 2, pp. 374–383, 2017.

[4] V. Välimäki, J. D. Parker, L. Savioja, J. O. Smith III, and

J. S. Abel, “Fifty years of artiﬁcial reverberation,” IEEE/ACM

Trans. Audio, Speech, Lang. Proc., vol. 20, no. 5, pp. 1421–

1448, July 2012.

[5] M. A. Gerzon, “Synthetic stereo reverberation: Part One,” Stu-

dio Sound, vol. 13, pp. 632–635, 1971.

[6] J. M. Jot and A. Chaigne, “Digital delay networks for design-

ing artiﬁcial reverberators,” in Proc. Audio Eng. Soc. Conv.,

Paris, France, Feb. 1991, pp. 1–12.

[7] D. Rocchesso, “Maximally diffusive yet efﬁcient feedback de-

lay networks for artiﬁcial reverberation,” IEEE Signal Pro-

cess. Lett., vol. 4, no. 9, pp. 252–255, 1997.

[8] ——, “The Ball within the Box: A Sound-Processing

Metaphor,” Comput. Music J., vol. 19, no. 4, pp. 47–57, 1995.

[9] M. R. Schroeder and B. F. Logan, “‘Colorless’ artificial reverberation,” IRE Trans. Audio, vol. AU-9, no. 6, pp. 209–214, 1961.

[10] J. A. Moorer, “About this reverberation business,” Comput.

Music J., vol. 3, no. 2, pp. 13–17, June 1979.

[11] L. Dahl and J. M. Jot, “A Reverberator based on Absorbent

All-pass Filters,” in Proc. Int. Conf. Digital Audio Effects

(DAFx), Verona, Italy, Dec. 2000, pp. 1–6.

[12] S. J. Schlecht and E. A. P. Habets, “Time-varying feedback

matrices in feedback delay networks and their application in

artiﬁcial reverberation,” J. Acoust. Soc. Amer., vol. 138, no. 3,

pp. 1389–1398, Sept. 2015.

[13] P. Rubak and L. G. Johansen, “Artiﬁcial Reverberation Based

on a Pseudo-Random Impulse Response,” in Proc. Audio Eng.

Soc. Conv., Amsterdam, The Netherlands, May 1998, pp. 1–

14.

[14] S. J. Schlecht and E. A. P. Habets, “Modal decomposition of feedback delay networks,” IEEE Trans. Signal Process., 2019.

[15] M. A. Gerzon, “Unitary (energy-preserving) multichannel

networks with feedback,” Electronics Letters, vol. 12, no. 11,

pp. 278–279, 1976.

[16] T. R. Cameron, “Spectral Bounds for Matrix Polynomials with

Unitary Coefﬁcients,” Electronic Journal of Linear Algebra,

vol. 30, no. 1, pp. 585–591, Feb. 2015.

[17] J. S. Abel and P. P. Huang, “A Simple, Robust Measure of

Reverberation Echo Density,” in Proc. Audio Eng. Soc. Conv.,

San Francisco, CA, USA, Oct. 2006, pp. 1–10.

[18] J. Karhumäki, S. Puzynina, M. Rao, and M. A. Whiteland,

“On cardinalities of k-abelian equivalence classes,” Theoreti-

cal Computer Science, vol. 658, pp. 190–204, Jan. 2017.

[19] J. Cassaigne, J. Karhumäki, S. Puzynina, and M. A. White-

land, “k-Abelian Equivalence and Rationality,” Fundamenta

Informaticae, vol. 154, no. 1-4, pp. 65–94, Jan. 2017.

[20] P. P. Huang and J. S. Abel, “Aspects of reverberation echo den-

sity,” in Proc. Audio Eng. Soc. Conv., New York, NY, USA,

Oct. 2007, pp. 1–7.

[21] A. Lindau, L. Kosanke, and S. Weinzierl, “Perceptual Eval-

uation of Model- and Signal-Based Predictors of the Mixing

Time in Binaural Room Impulse Responses,” J. Audio Eng.

Soc., vol. 60, no. 11, pp. 887–898, 2012.