2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics October 20-23, 2019, New Paltz, NY
ON THE BEHAVIOR OF DELAY NETWORK REVERBERATOR MODES
Orchisama Das, Elliot K. Canfield-Dafilou, Jonathan S. Abel
Center for Computer Research in Music and Acoustics,
Stanford University, Stanford, CA 94305 USA,
orchi|kermit|abel@ccrma.stanford.edu
ABSTRACT
The mixing matrix of a Feedback Delay Network (FDN) rever-
berator is used to control the mixing time and echo density profile.
In this work, we investigate the effect of the mixing matrix on the
modes (poles) of the FDN with the goal of using this information
to better design the various FDN parameters. We find the modal
decomposition of delay network reverberators using a state space
formulation, showing how modes of the system can be extracted
by eigenvalue decomposition of the state transition matrix. These
modes, and subsequently the FDN parameters, can be designed to
mimic the modes in an actual room. We introduce a parameterized
orthonormal mixing matrix which can be continuously varied from
identity to Hadamard. We also study how continuously varying dif-
fusion in the mixing matrix affects the damping and frequency of
these modes. We observe that modes approach each other in damp-
ing and then deflect in frequency as the mixing matrix changes from
identity to Hadamard. We also quantify the perceptual effect of in-
creasing mixing by calculating the normalized echo density (NED)
of the FDN impulse responses over time.
Index Terms—Feedback Delay Network, Artificial Reverber-
ation, Modal Analysis, Normalized Echo Density, Mixing Matrix
1. INTRODUCTION
Artificial reverberation techniques aim to synthesize an impulse re-
sponse using a combination of filters. An ideal artificial reverber-
ator reproduces a set of sparse early reflections which increase in
density over time, building toward late reverberation where the im-
pulse density is high and statistically Gaussian. In other words,
the “echo-density” (number of echoes per second) should increase
quadratically in time and is a good psychoacoustic measure of per-
ceived reverberation. Another characteristic of an ideal reverberator
is the frequency response—the number of modes should increase
with frequency squared, which means high frequency modes are
densely packed.
The first artificial reverberators were proposed by Schroeder [1]
and used comb filters in parallel followed by allpass filters in series.
While this architecture produces the desired increase in echo density
over time, unnatural coloration in the impulse response persisted
[2]. Gerzon [3] generalized delay network reverberators, suggesting
the use of a unitary feedback matrix to mix the outputs of the delay
lines into the inputs of each other. Jot and Chaigne later proposed
the Feedback Delay Network (FDN) [4, 5] which matched the delay
line lengths with shelf filters designed to yield a desired frequency
dependent T60. Since then, FDNs have gained popularity for creat-
ing efficient artificial reverberation. Some important contributions
in FDN research include [6, 7] in which the authors propose a cir-
culant feedback matrix for efficient implementation and maximum
diffusion, and those by Schlecht [8, 9] which deal with time-varying
FDNs and their practical implementation. Schlecht also studied the
properties of mixing matrices that produce lossless FDNs [10]. In a
recent paper [11], he investigated the modal decomposition of feed-
back delay networks using the Ehrlich-Aberth iteration for finding
poles and also studied the statistical distribution of mode frequen-
cies and amplitudes. In this paper, we explicitly derive the state
transition matrix for the case of frequency dependent T60s.
The design of an optimum mixing matrix to produce a desired
perceptual effect is still somewhat ambiguous. In this paper, we
first derive the FDN state transition matrix for scalar gains and first-
order shelf filters (for frequency dependent decay rates), and show
that the poles (modes) of the system are given by eigenvalues of the
state transition matrix. Next, we use the concept of homotopy [12]
to gradually alter the mixing matrix from a state of minimum dif-
fusion to maximum diffusion and observe how the pole trajectories
vary with mixing. We observe coupling between nearby modes,
where they first approach each other and then deflect, similar to
what was observed by Weinreich in piano strings [13]. We discuss
different ways of designing delay line T60 filters to model rooms
which have walls made of different materials. Then, we calculate
the normalized echo density (NED) [14] of the impulse responses
at different levels of mixing and compare how mixing affects the
time at which the late reverb starts (mixing time). NED has been
shown to be a good indicator of the psychoacoustic perception of
reverb [15, 16]. The effect of mixing time on echo density, which is
shown to be a polynomial function dependent on delay line lengths,
has been studied in [17]. Using curve fitting and empirical estima-
tion, we derive a parametric function relating mixing time to mean
delay line length and mixing matrix. This relationship between mix-
ing time and level of diffusion sheds some light on designing FDN
mixing matrices to achieve a desired perceptual effect.
The rest of the paper is organized as follows: in §2 we set up the
FDN state space equations and derive the state transition matrix for
frequency independent and frequency dependent decay rates. We
show that the modes of the FDN are eigenvalues of the state tran-
sition matrix. In §3, we discuss how to smoothly vary the mixing
matrix from zero to maximum diffusion. In §4, we see how mixing
affects the mode trajectories and discuss how different decay filters
can be designed. We also generate FDN impulse responses for dif-
ferent mixing matrices and show how their NEDs evolve with time.
Finally, we come up with a parametric equation relating the mix-
ing time to the mean delay line length and the mixing matrix. We
conclude the paper in §5 and delineate scope for future work.
2. FEEDBACK DELAY NETWORK MODAL DECOMPOSITION
Let us consider a feedback delay network with $N$ delay lines of lengths $\tau_1, \tau_2, \ldots, \tau_N$ samples, as shown in Fig. 1. The mixing matrix, denoted by $M$, is typically orthonormal and of order $(N \times N)$, and the 60 dB decay time ($T_{60}$) of each delay line is controlled by the filter $g_i(z)$. The input and output at time $n$ are given by $u(n)$ and $y(n)$ respectively. The vectors $b$ and $c$ denote the input and output gains, and $d$ is the direct path gain.
The FDN can be represented as a state space system, with the state vector $x$ defined by the values stored in each memory element.

$$x = [x_1, x_2, \ldots, x_{\tau_1}, x_{\tau_1+1}, \ldots, x_{\sum_i \tau_i}]^T \tag{1}$$

With a state transition matrix $A$, the state space equations can be written as

$$\begin{aligned}
x(n) &= A x(n-1) + b u(n) \\
y(n) &= c^T x(n) + d u(n) \\
Y(z) &= \left[ c^T (zI - A)^{-1} b + d \right] U(z)
\end{aligned} \tag{2}$$
2.1. State Transition Matrix
It follows from (2) that the modes of the FDN are the eigenvalues of the matrix $A$, since the poles of the transfer function are the values of $z$ at which $zI - A$ is singular. In this section, we will derive $A$ for two cases: first the case of frequency independent $T_{60}$s, in which the delay line filter is a scalar gain, and second the more commonly used frequency dependent $T_{60}$ case, in which $g_i(z)$ is typically given by a low-shelf filter.
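Since $A$ is a large sparse matrix of order $\left(\sum_{i=1}^{N} \tau_i \times \sum_{i=1}^{N} \tau_i\right)$, we will use block matrix notation, and denote $x$ as a stacked vector $[\tilde{x}_1^T, \ldots, \tilde{x}_N^T]^T$ and $A$ as a block matrix with $N \times N$ sub-matrices.

$$\begin{bmatrix} \tilde{x}_1(n) \\ \tilde{x}_2(n) \\ \vdots \\ \tilde{x}_N(n) \end{bmatrix}
=
\begin{bmatrix}
\tilde{A}_{11} & \tilde{A}_{12} & \ldots & \tilde{A}_{1N} \\
\tilde{A}_{21} & \tilde{A}_{22} & \ldots & \tilde{A}_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\tilde{A}_{N1} & \tilde{A}_{N2} & \ldots & \tilde{A}_{NN}
\end{bmatrix}
\begin{bmatrix} \tilde{x}_1(n-1) \\ \tilde{x}_2(n-1) \\ \vdots \\ \tilde{x}_N(n-1) \end{bmatrix} \tag{3}$$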
2.1.1. Frequency independent decay times
In this case, $g_i(z) = g_i$ for $i = 1 \ldots N$. Each sub-vector of the state vector $x$ can be written as

$$\tilde{x}_i = [x_{\sum_{j=1}^{i-1} \tau_j + 1}, \ldots, x_{\sum_{j=1}^{i} \tau_j}]^T \tag{4}$$

The sub-matrices $\tilde{A}_{ij}$ are of order $(\tau_i \times \tau_j)$ for $i, j = 1 \ldots N$. The diagonal blocks can be written as

$$\tilde{A}_{ii} = \begin{bmatrix}
0 & \ldots & 0 & g_i M_{ii} \\
1 & \ldots & 0 & 0 \\
\vdots & \ddots & \vdots & \vdots \\
0 & \ldots & 1 & 0
\end{bmatrix}, \tag{5}$$

where $M_{ii}$ is the $(i, i)$-th element of $M$. The off-diagonal blocks have zeros everywhere, except the $(1, \tau_j)$-th element, which is given by

$$\tilde{A}_{ij}(1, \tau_j) = g_j M_{ji} \tag{6}$$
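The block structure in (5) and (6) can be assembled numerically as in the following minimal sketch, which assumes NumPy and uses hypothetical function names (`fdn_state_matrix`, `fdn_modes`) that do not come from the paper. It uses a dense matrix and a general-purpose eigenvalue solver, so it is only practical for short delay lines.

```python
import numpy as np

def fdn_state_matrix(delays, gains, M):
    """Assemble the state transition matrix A for scalar delay-line gains.

    Block (i, j) is tau_i x tau_j. Each diagonal block carries a unit
    sub-diagonal (the delay-line shift), and the only other nonzero entry of
    block (i, j) is in its first row and last column: g_j * M[j, i].
    """
    delays = [int(t) for t in delays]
    # offsets[i] is the index of the first state of delay line i
    offsets = np.concatenate(([0], np.cumsum(delays)))
    A = np.zeros((offsets[-1], offsets[-1]))
    for i, tau_i in enumerate(delays):
        r0 = offsets[i]
        # shift inside delay line i: x_k(n) = x_{k-1}(n-1)
        for k in range(1, tau_i):
            A[r0 + k, r0 + k - 1] = 1.0
        # first state of line i is fed by the last state of every line j
        for j, tau_j in enumerate(delays):
            A[r0, offsets[j] + tau_j - 1] = gains[j] * M[j, i]
    return A

def fdn_modes(delays, gains, M):
    """Modes (poles) of the FDN as eigenvalues of A."""
    return np.linalg.eigvals(fdn_state_matrix(delays, gains, M))

if __name__ == "__main__":
    # Illustrative values only: two short delay lines, no mixing.
    modes = fdn_modes([3, 5], [0.5, 0.8], np.eye(2))
    print(np.sort(np.abs(modes)))
```

With $M = I$ the delay lines decouple and the modes reduce to the roots of $z^{\tau_i} = g_i$, which gives a convenient sanity check for the assembly.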
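Figure 1: State space FDN block diagram. (Diagram labels: input $u(n)$, input gains $b_1, \ldots, b_4$, four delay lines of lengths $\tau_1, \ldots, \tau_4$ holding the states $x_1, \ldots, x_{\tau_1+\tau_2+\tau_3+\tau_4}$, delay line filters $g_1(z), \ldots, g_4(z)$ with outputs $v_1, \ldots, v_4$, output gains $c_1, \ldots, c_4$, mixing matrix $M$, direct path gain $d$, output $y(n)$.)

Figure 2: First order shelf filter block diagram. (Diagram labels: input $x(n)$, internal state $w_i(n)$, output $v_i(n)$, coefficients $b_{0i}$, $b_{1i}$ and $-a_{1i}$.)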
2.1.2. Frequency dependent decay times
In this case, the delay-line filter is a first-order low shelf filter of the form

$$g_i(z) = \frac{b_{0i} + b_{1i} z^{-1}}{1 - a_{1i} z^{-1}} \tag{7}$$

This ensures that the lower frequency modes have a longer decay than the higher frequency modes, which is consistent with what is observed in actual rooms. We design a shelf filter by setting the filter coefficients according to desired gains at DC and Nyquist. The direct form II diagram of the shelf filter is given in Fig. 2. The state vector now has an additional state $w_i$ per sub-vector.

$$\tilde{x}_i = [x_{\sum_{j=1}^{i-1} \tau_j + 1}, \ldots, x_{\sum_{j=1}^{i} \tau_j}, w_i]^T \tag{8}$$
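As an illustration of setting shelf coefficients from desired gains at DC and Nyquist, the sketch below uses the sign convention of (7) as printed. Two assumptions are not taken from the paper: the standard relation $g = 10^{-3\tau/T_{60}}$ between a delay line of $\tau$ samples and the gain it needs for a given $T_{60}$ in samples, and the treatment of the pole $a_1$ as a free parameter, since two gain constraints alone do not fix three coefficients. Function names are hypothetical.

```python
import cmath

def line_gain(tau, t60):
    """Gain a delay line of `tau` samples needs for a 60 dB decay in `t60` samples."""
    return 10.0 ** (-3.0 * tau / t60)

def shelf_coeffs(g_dc, g_nyq, a1):
    """Numerator of (b0 + b1 z^-1) / (1 - a1 z^-1) with gain g_dc at z = 1
    and g_nyq at z = -1. The pole a1 (|a1| < 1) is a free design choice here."""
    b0 = 0.5 * (g_dc * (1.0 - a1) + g_nyq * (1.0 + a1))
    b1 = 0.5 * (g_dc * (1.0 - a1) - g_nyq * (1.0 + a1))
    return b0, b1

def shelf_response(b0, b1, a1, omega):
    """Complex frequency response at normalized frequency omega (rad/sample)."""
    zinv = cmath.exp(-1j * omega)
    return (b0 + b1 * zinv) / (1.0 - a1 * zinv)

if __name__ == "__main__":
    # Illustrative values: a 19-sample line, T60 of 150 samples at DC
    # and 50 samples at Nyquist; a1 = 0.5 is an arbitrary pole choice.
    g_dc, g_nyq = line_gain(19, 150), line_gain(19, 50)
    b0, b1 = shelf_coeffs(g_dc, g_nyq, 0.5)
    print(abs(shelf_response(b0, b1, 0.5, 0.0)), g_dc)
    print(abs(shelf_response(b0, b1, 0.5, cmath.pi)), g_nyq)
```

The two constraints come from evaluating (7) at $z = 1$ and $z = -1$; the choice of $a_1$ then sets where the transition between the two gains occurs.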
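The sub-matrices $\tilde{A}_{ij}$ are of order $((\tau_i + 1) \times (\tau_j + 1))$ for $i, j = 1 \ldots N$. The diagonal blocks can be written as

$$\tilde{A}_{ii} = \begin{bmatrix}
0 & \ldots & 0 & b_{0i} M_{ii} & M_{ii} \\
1 & \ldots & 0 & 0 & 0 \\
\vdots & \ddots & \vdots & \vdots & \vdots \\
0 & \ldots & 1 & b_{1i} - a_{1i} b_{0i} & -a_{1i}
\end{bmatrix} \tag{9}$$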
The off-diagonal blocks have zeros everywhere, except the $(1, \tau_j)$-th and $(1, \tau_j + 1)$-th elements, which are given by

$$\begin{aligned}
\tilde{A}_{ij}(1, \tau_j) &= b_{0j} M_{ji}, \\
\tilde{A}_{ij}(1, \tau_j + 1) &= M_{ji}
\end{aligned} \tag{10}$$
3. HOMOTOPY
Now that we have derived the state transition matrix of the FDN, and hence its modes, we wish to observe how these modes change as we modify the mixing matrix $M$. For this, we must alter the mixing matrix smoothly from a state of no mixing to maximum mixing. A slow and continuous change from one function to another is known as homotopy [12]. In [8], the authors create a feedback matrix evolution equivalent to a recursive update using linear modulation functions. In this paper, we parameterize $M$ as a function of $\theta$. Starting with the $(2 \times 2)$ rotation matrix $R(\theta)$, which is orthonormal, we can generate an $(N \times N)$ orthonormal mixing matrix $M(\theta)$ by taking the Kronecker product of $\log_2 N$ copies of $R(\theta)$. It is to be noted that the Kronecker product of two orthonormal matrices is also orthonormal [18].

$$\begin{aligned}
R(\theta) &= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \\
M_{N \times N}(\theta) &= R(\theta) \otimes R(\theta) \otimes \ldots \otimes R(\theta)
\end{aligned} \tag{11}$$

Starting with $\theta = 0$ gives us $M(0) = I_{N \times N}$, i.e., $N$ parallel delay lines with no mixing among them. By increasing $\theta$ in small increments, new mixing matrices are generated with increased mixing among delay lines, until $\theta = \frac{\pi}{4}$, when the mixing matrix becomes Hadamard and depicts the case of maximum mixing. From now on, we will define $\theta$ as the mixing angle, and parametrize the mixing matrix as $M(\theta)$.
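The construction in (11) can be sketched in a few lines, assuming NumPy and a power-of-two number of delay lines. The function name `mixing_matrix` is hypothetical.

```python
import numpy as np

def mixing_matrix(theta, n_lines):
    """Kronecker product of log2(n_lines) copies of the 2x2 rotation R(theta)."""
    stages = int(round(np.log2(n_lines)))
    if 2 ** stages != n_lines:
        raise ValueError("n_lines must be a power of two")
    R = np.array([[np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])
    M = np.eye(1)  # 1x1 identity, the neutral element for np.kron
    for _ in range(stages):
        M = np.kron(M, R)
    return M

if __name__ == "__main__":
    for theta in (0.0, np.pi / 8, np.pi / 4):
        M = mixing_matrix(theta, 4)
        # Orthonormality check: M^T M should equal the identity.
        print(theta, np.allclose(M.T @ M, np.eye(4)))
```

At $\theta = 0$ this returns the identity, and at $\theta = \frac{\pi}{4}$ every entry has magnitude $1/\sqrt{N}$, as expected of a normalized Hadamard-type matrix.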
4. RESULTS AND DISCUSSION
4.1. Mode Trajectories
In Fig. 3 we show the modes in the complex plane and the frequency dependent $T_{60}$ for an FDN with two delay lines, with $\tau_1 = 5$ (in red) and $\tau_2 = 19$ (in blue); $g_1(z)$ is a constant with $T_{60} = 40$ samples, and $g_2(z)$ is a low-shelf filter with $T_{60,\mathrm{DC}} = 150$ samples and $T_{60,\mathrm{Nyquist}} = 50$ samples. The color intensity increases from light to dark as $\theta$ goes from $0$ to $\frac{\pi}{4}$. We have squared the radii of the modes in Fig. 3a to make the pole movement easier to see.
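The two views in Fig. 3 are related by a simple conversion from a pole to its decay time and frequency, sketched below under the assumption that $T_{60}$ in samples is defined by $|p|^{T_{60}} = 10^{-3}$ for a pole $p$. The function names are hypothetical. For the first delay line with no mixing, the poles are the fifth roots of its scalar gain, so each should map back to a $T_{60}$ of 40 samples.

```python
import cmath
import math

def pole_to_t60(pole):
    """T60 in samples of the mode at `pole`, for 0 < |pole| < 1."""
    return -3.0 / math.log10(abs(pole))

def pole_to_frequency(pole):
    """Normalized frequency of the mode in radians per sample, in (-pi, pi]."""
    return cmath.phase(pole)

if __name__ == "__main__":
    tau, t60 = 5, 40.0
    gain = 10.0 ** (-3.0 * tau / t60)   # scalar gain giving T60 = 40 samples
    radius = gain ** (1.0 / tau)        # all roots of z^tau = gain share this radius
    for k in range(tau):
        p = cmath.rect(radius, 2.0 * math.pi * k / tau)
        print(pole_to_frequency(p), pole_to_t60(p))
```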
The modes are in complex conjugate pairs. With no mixing,
they are distributed uniformly in frequency. As mixing increases,
the modes in the delay lines that are close in frequency approach
each other in damping rapidly, and then deflect in frequency. This
coupling behavior is more accentuated in the lower frequencies,
with the modes at DC approaching each other and bifurcating. The
modes at higher frequencies show less movement, and the mode at
Nyquist remains stationary. In fact, once fully mixed, the shorter
delay line that started with a constant T60 shows low-pass behavior.
The coupling among these modes is similar to what was ob-
served by Weinreich in piano strings that are tuned with a slight
deviation in frequency [13]. Due to this mistuning, the strings have
slightly different angular frequencies and are coupled due to the dy-
namic motion of the bridge. The frequencies of the normal modes
of the strings as a function of the deviation in angular frequency
follow the same trajectory as our delay line modes when mixing
increases.
(a) Poles in the complex plane.
(b) Frequency dependent T60 (T60 in samples against normalized frequency).
Figure 3: Modes of FDN for the two delay line case with varying amounts of mixing. Red indicates a delay line with five taps and a scalar gain. Blue indicates a delay line with nineteen taps and a shelf filter. Lighter colors show the low mixing case and darker colors maximum mixing.
4.2. Designing different decay filters
Typically in an FDN reverberator, the decay filters are designed
such that all delay lines independently produce the same T60 fre-
quency response. However, it is likely that the physical configura-
tion of a room would require multiple, concurrent T60 responses.
One might use one set of filters to model air absorption while an-
other set of filters can be used to model the absorption due to the
materials in the room. For example, a church might have a pair
of parallel walls with glass windows, and another set of walls cov-
ered with drapes. The absorption coefficient of glass decreases with
increase in frequency, so its T60 response can be modeled by a low-
shelf filter, while the T60 response of the drapery is better modeled
with a low-shelf resonant filter [19]. It would be realistic to assign
part of the delay line filters to model the glass and the others to
model the drapery. In [20], the T60 filters were designed such that
all walls mimic the frequency-dependent absorption of cotton car-
pet. The mixing matrix could be adjusted to emulate the occupancy
of the church—adding furniture would increase interaction between
the modes, which is equivalent to increasing the mixing among the
delay lines. Similarly, FDNs could have different filters in the delay
lines with high diffusion to mimic the reverberant characteristics of
coupled rooms [21]. Designing mixing matrices for coupled rooms
with bleed between them has been previously studied in [22].

Figure 4: Impulse responses (amplitude) and NED profiles against time in ms for mixing matrices with $\theta$ set to $\frac{\pi}{40}$, $\frac{\pi}{8}$, and $\frac{\pi}{4}$, corresponding to 10%, 50%, and 100% mixing respectively (bottom to top).
4.3. Echo Density and Mixing Time
In Fig. 4, we show how the normalized echo density profile (calculated with a 50 ms Hanning window) changes with the mixing angle $\theta$ for an FDN with 16 delay lines, with delay lengths uniformly distributed between 10 and 20 ms at a sampling frequency of 48 kHz, with $T_{60,\mathrm{DC}} = 4$ s and $T_{60,\mathrm{Nyquist}} = 2$ s. As expected, as the amount of mixing increases, the echo density becomes Gaussian more quickly. Audio examples of the impulse responses are available at https://ccrma.stanford.edu/~orchi/FDN/IR.html
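For readers who want to reproduce an NED profile, the following is a minimal sketch assuming the formulation usually attributed to [14]: the weighted fraction of samples in a sliding window whose magnitude exceeds the weighted standard deviation of that window, divided by the fraction expected for Gaussian noise, $\mathrm{erfc}(1/\sqrt{2}) \approx 0.3173$. The edge handling, the forced odd window length, and the function name are assumptions made here, not details taken from the paper.

```python
import math
import numpy as np

def normalized_echo_density(h, window_len):
    """NED profile of impulse response h using a Hann-weighted sliding window."""
    h = np.asarray(h, dtype=float)
    if window_len % 2 == 0:
        window_len += 1                 # keep the window centered on a sample
    half = window_len // 2
    w = np.hanning(window_len)
    expected = math.erfc(1.0 / math.sqrt(2.0))  # Gaussian fraction beyond one std
    ned = np.zeros(len(h))
    for n in range(len(h)):
        lo, hi = max(0, n - half), min(len(h), n + half + 1)
        seg = h[lo:hi]
        # Part of the window that overlaps the signal, renormalized to sum to 1.
        ww = w[lo - (n - half): hi - (n - half)]
        ww = ww / ww.sum()
        sigma = np.sqrt(np.sum(ww * seg ** 2))
        ned[n] = np.sum(ww * (np.abs(seg) > sigma)) / expected
    return ned

if __name__ == "__main__":
    fs = 48000
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(fs // 4)        # Gaussian noise: NED near 1
    profile = normalized_echo_density(noise, int(0.05 * fs))
    print(profile[len(profile) // 2])
```

The mixing time used later in this section can then be read off as the first time at which the profile crosses the chosen threshold.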
An FDN artificial reverberator has many parameters that need to be tuned to produce a desired perceptual effect, such as the mixing matrix, the number of delay lines, and their lengths. The tuning of these parameters is still considered something of an art. The mixing time of a feedback delay network is defined as the time at which the echo density reaches a threshold, $T = 0.9$ in this case [15]. In [17], the authors come up with a closed form solution for choosing a mean delay length, given a desired mixing time. We wish to relate mixing time with the mixing angle. Figure 5 shows the mixing time against the mixing angle for the same FDN for 4 mean delay line lengths $\bar{\tau}$ of 10, 20, 50, and 100 ms. For each mean delay line length, we perform Monte Carlo simulations, such that each of the 16 delay line lengths is random and uniformly distributed on the interval

$$\tau_i \sim \mathcal{U}\left[\frac{\bar{\tau}}{\varphi}, \bar{\tau}\varphi\right], \tag{12}$$
where $\varphi$ is the golden ratio, and plot the average mixing time. An exponential relationship is observed between mixing time and mixing angle $\theta$. For $\theta = 0$, there is no mixing and the echo density never becomes Gaussian, which means the mixing time is theoretically infinite. For a given mean delay line length, it is possible to fit a curve to this plot using non-linear least squares and estimate what value of $\theta$ would produce the desired mixing time. The parametric equation for $t_{mix}$ in seconds, given $\theta$ and a mean delay line length of $\bar{\tau}$ in ms, is given in (13).

$$\begin{aligned}
t_{mix} &= \alpha \exp(-\beta\theta) + \gamma \\
\alpha &= 0.397\bar{\tau} - 20.02\gamma \\
\beta &= 20.246 - 0.09\bar{\tau} \\
\gamma &= 0.0075\bar{\tau}
\end{aligned} \tag{13}$$

Figure 5: Mixing time (s) as a function of mixing angle $\theta$ (divided by $\frac{\pi}{4}$, i.e., the fraction of mixing) for average delay line lengths $\bar{\tau}$ of 10, 20, 50, and 100 ms (bottom to top). Dashed lines are fitted curves using the parametric equation.
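Given the exponential form in (13), the mixing angle for a desired mixing time follows by inverting $t_{mix} = \alpha \exp(-\beta\theta) + \gamma$, which gives $\theta = -\ln\left((t_{mix} - \gamma)/\alpha\right)/\beta$. The sketch below takes $\alpha$, $\beta$ and $\gamma$ as inputs rather than hard-coding the fitted coefficients. The function names and the numbers in the example are hypothetical and chosen only for illustration.

```python
import math

def mixing_time(theta, alpha, beta, gamma):
    """Mixing time in seconds predicted by the exponential model."""
    return alpha * math.exp(-beta * theta) + gamma

def mixing_angle_for(t_mix, alpha, beta, gamma):
    """Mixing angle theta in [0, pi/4] that yields t_mix, or None if the
    requested mixing time is not reachable under the model."""
    if alpha <= 0 or beta <= 0 or t_mix <= gamma:
        return None
    theta = -math.log((t_mix - gamma) / alpha) / beta
    if 0.0 <= theta <= math.pi / 4:
        return theta
    return None

if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    alpha, beta, gamma = 2.0, 10.0, 0.1
    theta = mixing_angle_for(0.5, alpha, beta, gamma)
    print(theta, mixing_time(theta, alpha, beta, gamma))
```

Returning `None` for unreachable targets makes explicit that the model saturates at $\gamma$ for large $\theta$ and cannot deliver mixing times shorter than that.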
Combined with Schlecht’s method for predicting mean delay
line length [17], our method for picking a mixing matrix M(θ)to
achieve a desired mixing time will aid in the design of FDN re-
verberators. These results are also directly applicable to the FDN
resizing algorithm described in [23].
5. CONCLUSION
We have studied the modal decomposition of Feedback Delay Networks. While the modal decomposition for an FDN with a scalar gain is well known [6], we explicitly derived the modal decomposition with a shelf filter in the delay lines. We changed the mixing matrix smoothly from minimum mixing to maximum mixing and observed coupling among nearby modes from different delay lines. We used a two delay line example with a small number of modes for visualizing modal behavior. We also studied how mixing affects the echo density profile and mixing time of the FDN impulse response, and came up with a parametric equation to choose a mixing matrix, given a mean delay line length and a desired mixing time.
We could not do modal decompositions of FDNs with longer
delay lines that are commonly used in practice, because of the lim-
ited efficiency of MATLAB’s eig function when dealing with large
matrices. Nor could we find a good numeric method that could effi-
ciently and accurately give us all the eigenvalues of a large, sparse,
non-symmetric matrix. It would be interesting to use the numerical
method described in [11] to find the eigenvalues of large state tran-
sition matrices with many more states in a future work. We could
then generalize the observed coupled modal behavior in this paper
for practical FDNs with thousands of modes.
6. REFERENCES
[1] M. R. Schroeder, “Natural sounding artificial reverberation,”
Journal of the Audio Engineering Society, vol. 10, no. 3, pp.
219–223, 1962.
[2] J. A. Moorer, “About this reverberation business,” Computer
music journal, pp. 13–28, 1979.
[3] M. Gerzon, “Unitary (energy-preserving) multichannel net-
works with feedback,” Electronics Letters, vol. 12, no. 11, pp.
278–279, 1976.
[4] J.-M. Jot and A. Chaigne, “Digital delay networks for de-
signing artificial reverberators,” in Audio Engineering Society
Convention 90. Audio Engineering Society, 1991.
[5] J.-M. Jot, “An analysis/synthesis approach to real-time artifi-
cial reverberation,” in [Proceedings] ICASSP-92: 1992 IEEE
International Conference on Acoustics, Speech, and Signal
Processing, vol. 2. IEEE, 1992, pp. 221–224.
[6] D. Rocchesso and J. O. Smith, “Circulant and elliptic feed-
back delay networks for artificial reverberation,” IEEE Trans-
actions on Speech and Audio Processing, vol. 5, no. 1, pp.
51–63, 1997.
[7] D. Rocchesso, “Maximally diffusive yet efficient feedback de-
lay networks for artificial reverberation,” IEEE Signal Pro-
cessing Letters, vol. 4, no. 9, pp. 252–255, 1997.
[8] S. J. Schlecht and E. A. Habets, “Time-varying feedback ma-
trices in feedback delay networks and their application in arti-
ficial reverberation,” The Journal of the Acoustical Society of
America, vol. 138, no. 3, pp. 1389–1398, 2015.
[9] ——, “Practical considerations of time-varying feedback de-
lay networks,” in Audio Engineering Society Convention 138.
Audio Engineering Society, 2015.
[10] ——, “On lossless feedback delay networks,” IEEE Trans-
actions on Signal Processing, vol. 65, no. 6, pp. 1554–1564,
2017.
[11] ——, “Modal decomposition of feedback delay networks,”
arXiv preprint arXiv:1901.08865, 2019.
[12] J.-H. He, “Homotopy perturbation technique,” Computer
methods in applied mechanics and engineering, vol. 178, no.
3-4, pp. 257–262, 1999.
[13] G. Weinreich, “Coupled piano strings,” The Journal of the
Acoustical Society of America, vol. 62, no. 6, pp. 1474–1484,
1977.
[14] J. S. Abel and P. Huang, “A simple, robust measure of rever-
beration echo density,” in Audio Engineering Society Conven-
tion 121. Audio Engineering Society, 2006.
[15] P. Huang, J. S. Abel, H. Terasawa, and J. Berger, “Reverbera-
tion echo density psychoacoustics,” in Audio Engineering So-
ciety Convention 125. Audio Engineering Society, 2008.
[16] A. Lindau, L. Kosanke, and S. Weinzierl, “Perceptual evalua-
tion of model-and signal-based predictors of the mixing time
in binaural room impulse responses,” Journal of the Audio En-
gineering Society, vol. 60, no. 11, pp. 887–898, 2012.
[17] S. J. Schlecht and E. A. Habets, “Feedback delay networks:
Echo density and mixing time,” IEEE/ACM Transactions on
Audio, Speech, and Language Processing, vol. 25, no. 2, pp.
374–383, 2017.
[18] P. A. Regalia and S. K. Mitra, “Kronecker products, unitary
matrices and signal processing applications,” SIAM Review,
vol. 31, no. 4, pp. 586–613, 1989.
[19] J. S. Abel, D. P. Berners, and J. Herrera, “Signal processing
techniques for digital audio effects,” Course Notes for Stan-
ford Music 424, 2004.
[20] E. De Sena, H. Hacıhabiboğlu, Z. Cvetković, and J. O. Smith,
“Efficient synthesis of room acoustics via scattering delay net-
works,” IEEE/ACM Transactions on Audio, Speech and Lan-
guage Processing (TASLP), vol. 23, no. 9, pp. 1478–1492,
2015.
[21] J. S. Abel and M. J. Wilson, “Luciverb: Iterated convolution
for the impatient,” in Audio Engineering Society Convention
133. Audio Engineering Society, 2012.
[22] S. J. Schlecht and E. A. Habets, “Sign-agnostic matrix de-
sign for spatial artificial reverberation with feedback de-
lay networks,” in AES International Conference on Spatial
Reproduction-Aesthetics and Science. Audio Engineering
Society, 2018.
[23] E. K. Canfield-Dafilou and J. S. Abel, “Resizing rooms in
convolution, delay network, and modal reverberators,” in Pro-
ceedings of the 21st International Conference on Digital Au-
dio Effects, 2018.