
Presented at the IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2016.
Vietri sul Mare, Salerno, Italy, September 13-16, 2016.
LEARNING TO REPRODUCE A SOUND FIELD
Hanieh Khalilian, Ivan V. Bajić, and Rodney G. Vaughan
School of Engineering Science, Simon Fraser University, Burnaby, BC, Canada
ABSTRACT
The problem of sound field reproduction (SFR) in the case
where the location and parameters of the primary source are
known in advance has been well studied. In this paper, we
tackle the problem of SFR when there is uncertainty about
the location and radiation pattern of the primary source. To
account for various possibilities, we sample from the collec-
tion of the likely locations and source parameters, and learn
a dictionary that is able to provide a good representation of
this class of sound fields. We then design loudspeaker radiation
patterns to approximate the learned dictionary. Simulations
show that the proposed approach offers better performance
than the previously known SFR methods in cases where
the location and radiation pattern of the primary source are
unknown. The formulation is 2D in free space, with directional
loudspeakers and pressure sampling points that correspond to
omni microphones.
Index Terms: Sound field reproduction, dictionary
learning, higher-order loudspeakers.
1. INTRODUCTION
Sound field reproduction (SFR) is the process of recreating, in
a region of interest (the listening area), a desired sound field
that originates from a primary source, using an array of
loudspeakers. The goal of SFR is to minimize the reproduction
error in the listening area by optimizing the system parame-
ters such as the location, radiation patterns, and complex am-
plitudes (driving signals) of the loudspeakers. Some of these
parameters are set before the system operation while others
are meant to be optimized dynamically, in real time, during
the system operation. The former parameters are called static
degrees of freedom (DOFs), and the latter ones are referred to
as dynamic DOFs.
Wave field synthesis (WFS) [1], higher order ambisonics
(HOA) [2], and pressure matching [3,4] are three main classes
of SFR methods. In all three cases, loudspeaker complex am-
plitudes are dynamic DOFs, meant to be computed continu-
ously during run time. Other parameters are static DOFs de-
termined at the design stage, before run time. These parame-
ters can be set intuitively [2,4], or found by optimization. For
example, [3,5–7] optimize the loudspeaker locations at the de-
sign stage (static DOFs), and update the complex amplitudes
of the loudspeakers (dynamic DOFs) at run time using pres-
sure matching. In [8], each loudspeaker radiation pattern is
composed of one monopole and one radial dipole. The radia-
tion patterns are determined at the design stage (static DOFs)
while their complex amplitudes (dynamic DOFs) are updated
during run time by HOA. In [9], the higher order patterns are
computed at the design stage (static DOFs) by approximat-
ing an ideal Acoustic Transfer Function (ATF) matrix, while
in [10], both locations and radiation patterns of loudspeak-
ers are optimized at the design stage by constrained matching
pursuit. In both cases, loudspeaker complex amplitudes are
computed at run time (dynamic DOFs) by pressure matching.
While the above methods consider loudspeaker parame-
ters, such as their locations and radiation patterns, as static
DOFs, there are examples in the literature where radiation pat-
terns are treated as dynamic DOFs. For example, in [11–13],
loudspeaker radiation patterns are treated as dynamic DOFs
and computed at run time by HOA. If the radiation patterns
can be continuously and precisely controlled at run time, the
resulting SFR system would be capable of producing a wider
variety of sound fields. With such a system, if the primary
source moves or changes its own sound field, the loudspeakers
could update their pattern in order to match the directionality
and profile of the new sound field. But this comes at the cost
of added complexity, as well as the challenge of physically re-
alizing loudspeakers whose patterns can be controlled contin-
uously and precisely at run time. For the previously discussed
SFR approaches with fixed radiation patterns, a change in the
presumed primary source location or its sound field would
cause performance degradation.
This paper presents a design strategy for SFR systems
that accounts for the uncertainty about the primary source
and its parameters without the need to continuously update
loudspeaker patterns during run time. Hence, its complex-
ity is equivalent to the first group of SFR systems discussed
above, where the only dynamic DOFs are the loudspeaker
driving signals, while its performance approaches that of the
second group of SFR systems whose dynamic DOFs include
loudspeaker patterns. This is accomplished by considering
a class of sound fields that may be produced by the primary
source with a prior distribution of parameters. A dictionary
that can represent that class of sound fields is learned using
K-SVD [14, 15], and higher-order loudspeaker radiation patterns
are designed to approximate the learned dictionary. At
run time, only the loudspeaker driving signals are updated by
pressure matching. This strategy is tested by simulation on
2D SFR using the formulation from [16] for a single tone pri-
mary source. It outperforms conventional SFR design when
the primary source parameters do not match those used in the
design stage.
The paper is organized as follows. The K-SVD algorithm
is briefly reviewed in Section 2. Section 3 explains the problem
formulation and the proposed method. Simulation results
and a summary are presented in Sections 4 and 5, respectively.
2. DICTIONARY LEARNING BY K-SVD
The K-SVD algorithm was introduced in [14]. For SFR, we
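use its complex version [15]. Let $\mathbf{y}_i$ be an $M$-dimensional
vector in $\mathbb{C}^{M \times 1}$, and $\mathbf{D}$ be an $M \times N$ dictionary matrix in
$\mathbb{C}^{M \times N}$. Data vector $\mathbf{y}_i$ is represented as a sparse linear
combination of the columns of the dictionary matrix: $\mathbf{y}_i = \mathbf{D}\mathbf{x}_i$,
where $\mathbf{x}_i$ is the coefficient vector corresponding to $\mathbf{y}_i$. Let
$\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_K]$ be the data matrix whose columns are
data vectors, and let $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, ..., \mathbf{x}_K]$ be the corresponding
coefficient matrix. The K-SVD algorithm solves the following
optimization problem:
$$\arg\min_{\mathbf{X},\mathbf{D}} \|\mathbf{Y} - \mathbf{D}\mathbf{X}\|_2 \quad \text{s.t.} \quad \|\mathbf{x}_i\|_0 < T \ \text{ for } 1 \le i \le K, \tag{1}$$
where $\|\cdot\|_i$ is the $i$-norm, and $T$ is the (sparsity-promoting)
constraint on the 0-norm of the coefficient vector. This problem
is solved through the following steps [14,15]: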
(a) Updating the coefficient matrix: The dictionary matrix
is initialized as $\mathbf{D} = \mathbf{D}^{(0)}$. The coefficient vectors $\mathbf{x}_i$
are found by any pursuit algorithm that solves the following
optimization problem:
$$\arg\min_{\mathbf{x}_i} \|\mathbf{y}_i - \mathbf{D}\mathbf{x}_i\|_2 \quad \text{s.t.} \quad \|\mathbf{x}_i\|_0 < T. \tag{2}$$
Here we use Orthogonal Matching Pursuit (OMP) [17].
(b) Updating the dictionary matrix: The columns of the
dictionary matrix are updated one at a time. To
update the $i$-th column:
1) The data vectors that use this dictionary
member are determined, and their indices are
put in the set $\omega_i$.
2) The error matrix is calculated as $\mathbf{E}_i = \mathbf{Y} - \mathbf{D}_i\mathbf{X}$, where
the $i$-th column of $\mathbf{D}_i$ is zero and the other columns are equal to
those of $\mathbf{D}$.
3) The columns of $\mathbf{E}_i$ that correspond to the set $\omega_i$ are
selected and arranged in $\mathbf{E}_i^R$.
4) The Singular Value Decomposition of matrix $\mathbf{E}_i^R$ is
calculated: $\mathbf{E}_i^R = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$.
5) The $i$-th column of $\mathbf{D}$ is set to the first column of $\mathbf{U}$,
and the coefficients corresponding to this dictionary member
are updated as the first column of $\mathbf{V}$ multiplied by the first
singular value of $\mathbf{E}_i^R$ (i.e., $\boldsymbol{\Sigma}(1,1)$).
Steps (a) and (b) repeat until the error is less than a desired
value. Matrices $\mathbf{X}$ and $\mathbf{D}$ are the outputs.
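Steps (a) and (b) can be sketched in a few lines of NumPy. The following is a minimal illustration and not the implementation used for the results reported here: the function names `omp` and `ksvd`, the fixed iteration count `n_iter` (used instead of an error threshold), and the reading of $T$ as the maximum number of atoms per data vector are assumptions made for the example.

```python
import numpy as np

def omp(D, y, T):
    """Greedy OMP: select at most T columns (atoms) of D to approximate y."""
    residual, support = y.copy(), []
    x = np.zeros(D.shape[1], dtype=complex)
    for _ in range(T):
        if np.linalg.norm(residual) < 1e-12:
            break
        # Atom most correlated with the current residual
        k = int(np.argmax(np.abs(D.conj().T @ residual)))
        if k in support:
            break
        support.append(k)
        # Least-squares fit on the selected atoms keeps the residual orthogonal
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x

def ksvd(Y, D0, T, n_iter=10):
    """Complex K-SVD sketch: alternate sparse coding (a) and atom updates (b)."""
    D = D0 / np.linalg.norm(D0, axis=0, keepdims=True)  # unit-norm atoms
    for _ in range(n_iter):
        # (a) update the coefficient matrix column by column
        X = np.column_stack([omp(D, Y[:, i], T) for i in range(Y.shape[1])])
        # (b) update the dictionary one atom at a time
        for i in range(D.shape[1]):
            omega = np.flatnonzero(X[i, :])      # data vectors using atom i
            if omega.size == 0:
                continue
            X[i, omega] = 0                      # same effect as zeroing column i of D
            E_R = Y[:, omega] - D @ X[:, omega]  # restricted error matrix
            U, S, Vh = np.linalg.svd(E_R, full_matrices=False)
            D[:, i] = U[:, 0]
            X[i, omega] = S[0] * Vh[0, :]        # sigma_1 times v_1^H
    return D, X
```

Each atom update replaces one column of $\mathbf{D}$ and the matching row of $\mathbf{X}$ by the best rank-one fit to the restricted error, so the residual $\|\mathbf{Y} - \mathbf{D}\mathbf{X}\|$ cannot increase during step (b).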
3. SFR PROBLEM FORMULATION
Let $N$ be the number of loudspeakers, $M$ be the number
of sampling points from the listening area, and $\mathbf{b}_n$ and $\mathbf{a}_m$ be
the locations of the $n$-th loudspeaker and the $m$-th sampling
point, respectively. Let $\mathbf{p}(f) = [p_1, p_2, ..., p_M]^T$ be the pressure
created by the loudspeaker array at the sampling points at
frequency $f$, $\mathbf{p}_{des}(f)$ be the desired field vector (at the sampling
points), $\mathbf{s}(f) = [s_1, ..., s_N]^T$ be the vector of complex amplitudes
of the loudspeakers, and $\mathbf{G}(f)$ be the $M \times N$ Acoustic
Transfer Function (ATF) matrix. The $(m,n)$-th element of
$\mathbf{G}(f)$ is $g_{m,n}(f, \mathbf{b}_n, \mathbf{a}_m)$, which is the Green's function of the
$n$-th loudspeaker at the $m$-th sampling point, multiplied by
$\mathcal{L}_{m,n}(f, \mathbf{b}_n, \mathbf{a}_m)$, which is the complex amplitude gain of the
radiation pattern of the $n$-th loudspeaker at the $m$-th sampling
point at frequency $f$. The Green's function of an omni 2D pattern
in free space is well known (e.g. monopole in 2D [16]):
$$g_{m,n}(f, \mathbf{b}_n, \mathbf{a}_m) = \frac{j}{4} H_0^{(1)}(kr), \tag{3}$$
where $H_l^{(1)}(\cdot)$ is the cylindrical Hankel function of order $l$,
$j = \sqrt{-1}$, $r = r_{m,n} = \|\mathbf{b}_n - \mathbf{a}_m\|_2$, $k = 2\pi f / C$ is the wave
number, and $C$ is the speed of sound.
The pressure vector p(f)at the sampling points can be
expressed by the following matrix equation:
$$\mathbf{p}(f) = \mathbf{G}(f)\,\mathbf{s}(f). \tag{4}$$
As usual in the SFR literature, fis dropped from the notation
for simplicity.
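As a concrete illustration of (3) and (4), the sketch below builds the ATF matrix of an array of omni-directional 2D sources and evaluates the pressure at a few sampling points. It is a sketch under stated assumptions: the function name `atf_omni`, the speed of sound of 343 m/s, the four sampling points, and the equal driving signals are all chosen for the example, and SciPy is assumed to be available for the Hankel function.

```python
import numpy as np
from scipy.special import hankel1

def atf_omni(b, a, f, c=343.0):
    """M x N ATF matrix of omni 2D sources: G[m, n] = (j/4) H_0^(1)(k r_mn).

    b: (N, 2) loudspeaker locations, a: (M, 2) sampling points,
    f: frequency in Hz, c: speed of sound in m/s (assumed value).
    """
    k = 2 * np.pi * f / c                                      # wave number
    r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # r_mn, shape (M, N)
    return 0.25j * hankel1(0, k * r)

# 15 loudspeakers on the line x = 0 and four illustrative sampling points
b = np.column_stack([np.zeros(15), np.linspace(-2, 2, 15)])
a = np.array([[3.0, -1.0], [5.0, -1.0], [5.0, 1.0], [3.0, 1.0]])
G = atf_omni(b, a, f=1000.0)
s = np.ones(15, dtype=complex) / 15   # equal complex amplitudes (example only)
p = G @ s                             # equation (4)
```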
One of the systems (called “benchmark”) in our simulations
is composed of omni-directional loudspeakers with
$\mathcal{L}_{m,n}(f, \mathbf{b}_n, \mathbf{a}_m) = 1$. Other systems in our simulation
utilize directive, also known as higher-order, loudspeakers. The
pressure caused by an $L$-th order loudspeaker located at $\mathbf{b}_n$
with complex amplitude $s_n$, at point $\mathbf{a}_m$, is given by [12]:
$$p_{m,n}(\mathbf{b}_n, \mathbf{a}_m) = s_n \sum_{l=-L}^{L} w_l(k)\, H_l^{(1)}(kr)\, e^{jl\phi}, \tag{5}$$
where $(r, \phi)$ are the polar coordinates of point $\mathbf{a}_m$ with
respect to $\mathbf{b}_n$, and the $w_l$'s are the expansion coefficients.
To have a fair comparison (energy-wise) between an L-th
order loudspeaker and an omni-directional loudspeaker, the
expansion coefficients are found such that the total radiated
power is equal in both cases for equal complex amplitudes.
For this 2D case, we first find the far-field radiation patterns
in both cases, and then calculate the integral of the radiation
power around a circle.
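For small order $l$ and large argument $kr$, the Hankel function
$H_l^{(1)}(kr)$ is approximated as:
$$H_l^{(1)}(kr) \approx \sqrt{\frac{2}{\pi k r}}\; e^{j\left(kr - (2l+1)\pi/4\right)}. \tag{6}$$
Therefore, the pressure caused by a zeroth-order loudspeaker
and an $L$-th order loudspeaker with complex amplitude of 1
are approximated by (7) and (8), respectively:
$$p_{m,n}(\mathbf{b}_n, \mathbf{a}_m) = \frac{j}{4}\sqrt{\frac{2}{\pi k r}}\; e^{j\left(kr - \frac{\pi}{4}\right)} \times 1 \tag{7}$$
$$p_{m,n}(\mathbf{b}_n, \mathbf{a}_m) = \frac{j}{4}\sqrt{\frac{2}{\pi k r}}\; e^{j\left(kr - \frac{\pi}{4}\right)} \sum_{l=-L}^{L} w'_l\, e^{-j(2l)\frac{\pi}{4}}\, e^{jl\phi} = g_{m,n}(\mathbf{b}_n, \mathbf{a}_m) \times \mathcal{L}_{m,n}(\mathbf{b}_n, \mathbf{a}_m) \tag{8}$$
In this equation, $w'_l = -4j\,w_l$. Taking the integrals of the squared
magnitudes of the above expressions around a circle, and equating them, leads to:
$$\sum_{l=-L}^{L} |w'_l|^2 = 1. \tag{9}$$
Therefore, for a fair comparison with omni-directional loudspeakers,
the expansion coefficients of higher-order loudspeakers
should satisfy (9).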
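The normalization in (9) can be checked numerically: once the coefficients $w'_l$ have unit Euclidean norm, the angular factor of (8) has the same mean power around a circle as the omni-directional pattern, whose angular factor is 1. The sketch below is illustrative only; the function names, the random coefficients, and the ordering of the coefficient vector from $l = -L$ to $l = L$ are assumptions.

```python
import numpy as np

def normalize_coeffs(w_prime):
    """Scale the coefficients w'_l so that sum_l |w'_l|^2 = 1, as in (9)."""
    w_prime = np.asarray(w_prime, dtype=complex)
    return w_prime / np.linalg.norm(w_prime)

def farfield_pattern(w_prime, phi):
    """Angular factor of (8): sum_l w'_l exp(-j l pi/2) exp(j l phi).

    w_prime is assumed to be ordered from l = -L to l = L.
    """
    L = (len(w_prime) - 1) // 2
    l = np.arange(-L, L + 1)
    return (w_prime * np.exp(-1j * l * np.pi / 2)) @ np.exp(1j * np.outer(l, phi))

rng = np.random.default_rng(1)
w = normalize_coeffs(rng.standard_normal(7) + 1j * rng.standard_normal(7))  # L = 3
phi = np.linspace(0, 2 * np.pi, 720, endpoint=False)
# Mean of |pattern|^2 over the circle; the omni-directional pattern gives 1
power_ratio = np.mean(np.abs(farfield_pattern(w, phi)) ** 2)
```

With a uniform angular grid finer than the highest order, the cross terms between different orders average to zero, so `power_ratio` equals 1 up to rounding error.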
Suppose all we know about the primary source is the prior
distribution of its position in a certain region of space $R$ and
its maximum order $L_{ps}$ according to the decomposition in (5).
We present an algorithm to design an SFR system that is capable
of approximating sound fields produced by such a source.
Our goal is to find expansion coefficients $w_l$ in (5) that are
able to provide a good approximation to the class of possible
sound fields, subject to the constraint in (9).
First, K-SVD is employed to learn a dictionary that is able
to represent the class of sound fields for such a primary source.
The procedure is presented in Algorithm 1. $K_1$ sample points
are drawn from the prior distribution for the location of the
primary source, then each term in (5) originating from the
sampled points is treated as a data vector. A dictionary $\mathbf{D}$ is
learned to represent such data vectors, starting from an initial
dictionary that represents an array of omni-directional
loudspeakers.
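The construction of the data matrix and the initial dictionary described above might look as follows. This is a sketch under stated assumptions: the function names, the rectangular region given by its corner vectors `lo` and `hi`, the uniform prior, and the argument layout are choices made for the example, and SciPy is assumed to be available for the Hankel functions.

```python
import numpy as np
from scipy.special import hankel1

def source_terms(q, a, L_ps, k):
    """Columns (j/4) H_l^(1)(k r) exp(j l phi) for l = -L_ps..L_ps.

    (r, phi) are the polar coordinates of the points a (M x 2)
    with respect to the source location q.
    """
    d = a - q
    r = np.hypot(d[:, 0], d[:, 1])
    phi = np.arctan2(d[:, 1], d[:, 0])
    l = np.arange(-L_ps, L_ps + 1)
    radial = hankel1(l[None, :], k * r[:, None])          # shape (M, 2*L_ps + 1)
    return 0.25j * radial * np.exp(1j * l[None, :] * phi[:, None])

def build_training_data(a, b, k, L_ps, K1, lo, hi, rng):
    """Return the data matrix Y and the initial dictionary D0."""
    q = rng.uniform(lo, hi, size=(K1, 2))                 # uniform prior over R
    Y = np.hstack([source_terms(qi, a, L_ps, k) for qi in q])
    # Initial dictionary: omni-directional ATF vector of each loudspeaker
    D0 = np.hstack([source_terms(bn, a, 0, k) for bn in b])
    return Y, D0
```

Each sampled location contributes $2L_{ps}+1$ columns to $\mathbf{Y}$, so the data matrix has $K_1(2L_{ps}+1)$ columns in total.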
Since the learned dictionary $\mathbf{D}$ might not be physically
realizable in terms of loudspeakers, we employ Algorithm 2
in order to design higher-order loudspeakers that approximate
the elements of $\mathbf{D}$. This completes the design stage of the
SFR system and finalizes its static DOFs.
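The greedy assignment of dictionary members to loudspeakers in Algorithm 2 can be sketched as below. This is an illustration rather than the exact procedure: the function name `design_patterns` is hypothetical, the matrices $\mathbf{H}_n$ are passed in as a list, and the norm constraint in (10) is handled by a plain least-squares fit followed by rescaling when the norm exceeds 1, which is a simplification of the regularized LS solution referred to in Algorithm 2.

```python
import numpy as np

def design_patterns(D, H):
    """Greedy sketch of Algorithm 2.

    D: M x N learned dictionary.
    H: list of N arrays; H[n] is the M x (2L+1) matrix of loudspeaker n.
    Returns a dict mapping loudspeaker index n_i to its coefficients w^(n_i).
    """
    available = set(range(len(H)))
    design = {}
    for i in range(D.shape[1]):
        best = None
        for n in available:
            # Least-squares fit of dictionary member d_i with loudspeaker n
            w, *_ = np.linalg.lstsq(H[n], D[:, i], rcond=None)
            norm = np.linalg.norm(w)
            if norm > 1:          # simplified handling of ||w|| <= 1
                w = w / norm
            err = np.linalg.norm(H[n] @ w - D[:, i]) ** 2
            if best is None or err < best[0]:
                best = (err, n, w)
        _, n_i, w_i = best
        design[n_i] = w_i         # loudspeaker n_i approximates d_i
        available.remove(n_i)     # each loudspeaker is used once
    return design
```

Because each loudspeaker is removed from the candidate set once selected, the result depends on the order in which the dictionary members are processed.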
At run time, the desired field is sampled around the
perimeter of the listening area [16] to produce the desired
field vector $\mathbf{p}_{des}$. Least Squares (LS) pressure matching under
a power constraint is used to find the driving signals $\mathbf{s}_{opt}$ for
the loudspeakers. This is accomplished as
$$\mathbf{s}_{opt} = \arg\min_{\mathbf{s}} \|\mathbf{G}\mathbf{s} - \mathbf{p}_{des}\|_2^2 \quad \text{s.t.} \quad \|\mathbf{s}\|_2^2 \le p_{max}, \tag{12}$$
Algorithm 1 Dictionary learning by K-SVD
Input: $R$, primary source region
Input: $L_{ps}$, maximum order of the primary source
Input: $\{\mathbf{b}_n\}_{n=1}^{N}$, loudspeaker locations
Output: $\mathbf{D}$, learned dictionary
1: Region $R$ is sampled at $K_1$ points $\{\mathbf{q}_1, \mathbf{q}_2, ..., \mathbf{q}_{K_1}\}$ according
to the prior distribution of the primary source location.
2: $M < N$ points $\{\mathbf{a}'_1, \mathbf{a}'_2, ..., \mathbf{a}'_M\}$ from the listening area
are selected as sampling points for the sound field.
3: for $i = 1$ to $K_1$ do
4: A primary source of order $L_{ps}$ is imagined to be located
at $\mathbf{q}_i$.
5: The pressure created at $\{\mathbf{a}'_j\}_{j=1}^{M}$ by each term
$\frac{j}{4} H_l^{(1)}(kr)\, e^{jl\phi}$ in (5) is calculated and arranged in
a data vector. Therefore, the number of resulting data
vectors for a primary source at $\mathbf{q}_i$ is $2L_{ps} + 1$.
6: end for
7: Arrange all data vectors in the data matrix $\mathbf{Y}$.
8: Form the initial dictionary matrix $\mathbf{D}^{(0)}$ with size $M \times N$.
The columns of $\mathbf{D}^{(0)}$ are the ATF vectors of an omni-directional
loudspeaker located at $\mathbf{b}_n$ to the sampling
points $\{\mathbf{a}'_j\}$.
9: The K-SVD algorithm with $N_{it}$ iterations is applied on
matrices $\mathbf{D}^{(0)}$ and $\mathbf{Y}$ to obtain $\mathbf{D}$.
10: return $\mathbf{D}$ as the learned dictionary.
where $p_{max}$ is the upper limit on the normalized power of the
loudspeaker array. It should be noted that the number and
locations of the sampling points for the desired field $\mathbf{p}_{des}$ are,
in general, different from those of the sampling points $\{\mathbf{a}'_j\}_{j=1}^{M}$ used
in the design. The solution to (12) is:
$$\mathbf{s}_{opt} = (\mathbf{G}^H\mathbf{G} + \gamma\mathbf{I})^{-1}\mathbf{G}^H\mathbf{p}_{des} \tag{13}$$
where $(\mathbf{G}^H\mathbf{G} + \gamma\mathbf{I})^{-1}\mathbf{G}^H$ is the regularized pseudo-inverse of the ATF
matrix, and $\gamma$ is the regularization parameter. This parameter
is found by the method from [18] in order to limit the
loudspeaker power to $p_{max}$. In addition, since the columns of the
ATF matrix are correlated at lower frequencies, this parameter
results in a robust solution for the SFR system.
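A minimal sketch of the regularized solution (13) is given below. The selection of $\gamma$ here is a simple bisection on a logarithmic scale that raises $\gamma$ until the power constraint in (12) holds; this is an assumption made for illustration and is not claimed to be the method of [18]. The function name `pressure_matching` and the parameters `gamma_min` and `n_bisect` are likewise hypothetical.

```python
import numpy as np

def pressure_matching(G, p_des, p_max, gamma_min=1e-12, n_bisect=60):
    """Regularized LS driving signals (13) with ||s||^2 <= p_max.

    Returns (s_opt, gamma). gamma stays at gamma_min when the
    (almost) unregularized solution already satisfies the constraint.
    """
    GhG = G.conj().T @ G
    Ghp = G.conj().T @ p_des
    eye = np.eye(G.shape[1])

    def solve(gamma):
        return np.linalg.solve(GhG + gamma * eye, Ghp)

    def power(gamma):
        return np.linalg.norm(solve(gamma)) ** 2

    if power(gamma_min) <= p_max:
        return solve(gamma_min), gamma_min
    lo, hi = gamma_min, 1.0
    while power(hi) > p_max:       # power decreases monotonically with gamma
        hi *= 10.0
    for _ in range(n_bisect):
        mid = np.sqrt(lo * hi)     # bisection on a logarithmic scale
        if power(mid) > p_max:
            lo = mid
        else:
            hi = mid
    return solve(hi), hi
```

Since $\|\mathbf{s}_{opt}\|_2$ decreases monotonically as $\gamma$ grows, the search returns the smallest $\gamma$ (to within the bisection tolerance) for which the constraint is met.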
4. SIMULATION RESULTS
We consider the 2D SFR configuration shown in Fig. 1. $N = 15$
loudspeakers are located on a linear array between points
$(0, -2)$ and $(0, 2)$. The listening area is a 2 m × 2 m square
located 3 m away from the loudspeaker array. The region $R$ of
possible locations of the primary source is $R = [-15, -10] \times [-5, 10]$
(that is, $-15 \le x \le -5$ and $-10 \le y \le 10$, as in Fig. 4),
shown as a green rectangle in Fig. 1. Within this region,
we assume the prior distribution for the primary source
location is uniform (i.e., equally likely to be anywhere in
$R$), and we sample $K_1 = 400$ points in Algorithm 1
uniformly. The maximum order of the primary source is set
Algorithm 2 Loudspeaker pattern design
Input: $\mathbf{D}$, learned dictionary
Input: $\{\mathbf{b}_n\}_{n=1}^{N}$, loudspeaker locations
Input: $\{\mathbf{a}'_j\}_{j=1}^{M}$, sampling points in the listening area
Output: $\{\mathbf{w}^{(n_i)}\}$, loudspeaker expansion coefficients
1: Set $A = \{1, 2, ..., N\}$.
2: for $i = 1$ to $N$ do
3: For the dictionary vector $\mathbf{d}_i$, solve the following
optimization problem for all $n \in A$:
$$\mathbf{w}^{(n)} = \arg\min_{\mathbf{w}} \|\mathbf{H}_n\mathbf{w} - \mathbf{d}_i\|_2^2 \quad \text{s.t.} \quad \|\mathbf{w}\| \le 1 \tag{10}$$
where $\mathbf{H}_n$ is an $M \times (2L+1)$ matrix whose columns
represent $\frac{j}{4} H_l^{(1)}(kr)\, e^{jl\phi}$ from the $n$-th loudspeaker at
$\mathbf{b}_n$ to the sampling points $\{\mathbf{a}'_j\}$, and $\mathbf{w}^{(n)}$ is a $(2L+1) \times 1$
vector that contains the expansion coefficients $w_l$ in
decreasing order of $l$. The Least Squares (LS) solution is
used for solving this optimization problem (the LS
solution is given later in (13)).
4: Select the loudspeaker index as
$$n_i = \arg\min_{n \in A} \|\mathbf{H}_n\mathbf{w}^{(n)} - \mathbf{d}_i\|_2^2 \tag{11}$$
5: The loudspeaker corresponding to index $n_i$ approximates
the $i$-th dictionary member $\mathbf{d}_i$ with expansion
coefficients $\mathbf{w}^{(n_i)}$.
6: Exclude the selected index from the set: $A = A \setminus \{n_i\}$.
7: end for
8: return $\{\mathbf{w}^{(n_i)}\}$.
$L_{ps} = 5$, while the order of the loudspeakers in the array
is set to $L = 3$ (third order loudspeakers have been
demonstrated in [19]). For dictionary learning, $M = 12$ sampling
points are distributed uniformly on the perimeter of the square
listening area. For run time testing, $M = 24$ sampling points
are located on the perimeter of the listening area. For
performance evaluation, the reproduction error is calculated at
10,000 points distributed uniformly throughout the listening
area:
$$\text{Error (dB)} = 10 \log_{10} \frac{\|\mathbf{p}_{des} - \mathbf{p}\|_2^2}{\|\mathbf{p}_{des}\|_2^2}, \tag{14}$$
where $\mathbf{p}_{des}$ is the desired field vector, and $\mathbf{p}$ is the generated
field vector.
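The error measure (14) is straightforward to compute; a small sketch follows, with the function name `reproduction_error_db` and the toy field vectors chosen purely for illustration.

```python
import numpy as np

def reproduction_error_db(p_des, p):
    """Normalized squared reproduction error (14) in dB."""
    p_des = np.asarray(p_des, dtype=complex)
    p = np.asarray(p, dtype=complex)
    num = np.sum(np.abs(p_des - p) ** 2)
    den = np.sum(np.abs(p_des) ** 2)
    return 10 * np.log10(num / den)

# Toy example: the generated field is the desired field scaled by 0.9,
# so the normalized error is 0.1^2 = 0.01, i.e. about -20 dB
p_des = np.array([1.0, 1.0j, -1.0, -1.0j])
err = reproduction_error_db(p_des, 0.9 * p_des)
```

More negative values indicate better reproduction, and 0 dB corresponds to an error as large as the desired field itself.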
We compare the proposed system against three others:
System 1 is the benchmark system consisting of omni-directional
(zeroth order) loudspeakers, whose driving
signals are updated at run time by pressure matching
(13), as in our system.
System 2 uses higher-order ($L = 3$) loudspeakers,
same as our system, but its patterns are optimized for
one location of the primary source: the centroid of
its prior distribution, which is $(-10, 0)$ in our simulations.
Each loudspeaker is considered as a combination
of $(2L + 1)$ loudspeakers with ATF vectors of
$\frac{j}{4} H_l^{(1)}(kr)\, e^{jl\phi}$ from that loudspeaker location to
the sampling points. The expansion coefficients are
calculated by solving (13) with $\gamma = 10^{-8}$, and using
the resulting complex amplitudes as the $w'_l$'s (and consequently
the $w_l$'s in (5)). They are normalized to satisfy (9)
and kept fixed from then on. At run time, driving signals
are updated by pressure matching (13), as in our
system.
System 3 is an extension of System 2, but with the ability
to change the loudspeaker patterns at run time; it is
an “ultimate” SFR system. Each loudspeaker is again
considered as a combination of $(2L + 1)$ loudspeakers
with ATF vectors of $\frac{j}{4} H_l^{(1)}(kr)\, e^{jl\phi}$ from that
loudspeaker location to the sampling points. The driving
signals, which now include expansion coefficients
in (5), are computed by pressure matching (13) at run
time. This system has $2L + 1 = 7$ times the number
of dynamic DOFs of the previous systems. Hence, it
is more complex (at run time), but also offers ultimate
flexibility in approximating the desired sound field.
Fig. 1. Configuration of the SFR system.
In the first experiment, an omni-directional primary
source emitting a single tone at $f = 1000$ Hz is located at
$(-5, 3)$, at the edge of region $R$. The real part of the desired
field and the sound fields produced by the four SFR systems
with $p_{max} = 0.5$ are shown in Fig. 2. Visually, it is clear that
the fields in Fig. 2(b) and Fig. 2(e) approximate the desired
field quite well, while those in Fig. 2(c) and Fig. 2(d) provide
a poorer approximation. This is confirmed by the reproduction
error (14), which is $-28$ dB for the proposed system, while
for Systems 1-3 it is $-7$ dB, $-17$ dB, and $-35$ dB, respectively.
Hence, the proposed system outperforms Systems 1 and 2,
which have the same run-time complexity (same number of
dynamic DOFs), and is slightly worse than System 3, which
has 7 times the number of dynamic DOFs.
Fig. 2. Real parts of the (a) desired field, and fields produced
by (b) proposed system, (c) System 1, (d) System 2, and (e)
System 3 in the listening area. The degradation of (c) and (d)
is clearly visible here. (Axes in each panel: x from 3 to 5, y from -1 to 1.)
To explain the reason behind such performance, the far-field
array radiation patterns of the four systems are shown in
Fig. 3. Since the primary source is at $(-5, 3)$, the listening
area is to the south-east of it, and an efficient loudspeaker
array would send most of its power in that direction. In Fig. 3
we see that this is true for the proposed system and Systems 2 and
3, while System 1 wastes too much power in other directions.
Fig. 3. Normalized far-field linear amplitude pattern of the
loudspeaker array for (a) proposed system, (b) System 1, (c)
System 2, (d) System 3. The array patterns are much higher
order than the element order, $L$. (Polar plots, radial scale 0 to 1.)
The second experiment examines SFR at various frequencies.
The radiation patterns of the proposed system and System 2
are optimized at each frequency as described above, using
a uniform prior distribution for the primary source. However,
the primary source location is randomly chosen according
to the probability density function shown in Fig. 4, which
is a truncated isotropic 2D Gaussian centered at $(-10, 5)$ with
standard deviation of 1. The primary source is assumed to be
a 5-th order loudspeaker (5) with randomly chosen expansion
coefficients. Specifically, each coefficient is a complex random
variable with magnitude uniformly distributed in $[0, 1]$
and phase uniformly distributed in $[0, 2\pi)$, and then normalized
as in (9). The SFR power limit is set as $p_{max} = 0.5$
for all systems. The reproduction error (14) is averaged over
1000 primary source realizations at each frequency and shown
in Fig. 5. According to the figure, the proposed system outperforms
Systems 1 and 2 except at very low frequencies. The
reason is that at low frequencies, data matrix $\mathbf{Y}$ in K-SVD
contains highly correlated columns, which degrades dictionary
learning performance. At these frequencies, since all
data vectors are similar, the convergence rate of the dictionary
learning algorithm is slower. In our simulations, the number
of iterations is fixed across all frequencies. Increasing
the number of iterations is expected to improve
performance at these frequencies. At higher frequencies this is
not an issue, and the proposed system approaches the
performance of System 3.
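The random primary sources of the second experiment can be drawn as in the following sketch. Several details are assumptions made for illustration: the function name `random_source`, the use of rejection sampling to truncate the Gaussian to the region, the region bounds passed as corner vectors, and the centre of the Gaussian, which is left as a required argument.

```python
import numpy as np

def random_source(rng, center, L_ps=5, sigma=1.0,
                  lo=(-15.0, -10.0), hi=(-5.0, 10.0)):
    """Draw one primary source realization.

    Location: isotropic 2D Gaussian around `center`, truncated to the
    rectangle [lo, hi] by rejection sampling (an assumed mechanism).
    Coefficients: magnitude uniform in [0, 1], phase uniform in [0, 2*pi),
    then scaled to unit Euclidean norm as in (9).
    """
    while True:
        q = rng.normal(center, sigma)
        if np.all(q >= lo) and np.all(q <= hi):
            break
    n_coef = 2 * L_ps + 1
    magnitude = rng.uniform(0.0, 1.0, n_coef)
    phase = rng.uniform(0.0, 2 * np.pi, n_coef)
    w_prime = magnitude * np.exp(1j * phase)
    return q, w_prime / np.linalg.norm(w_prime)

rng = np.random.default_rng(0)
q, w_prime = random_source(rng, center=(-10.0, 5.0))
```

Averaging the error (14) over many such realizations at each frequency gives curves of the kind shown in Fig. 5.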
5. SUMMARY
We have presented a design strategy for SFR that accounts for
uncertainty in the primary source parameters. The idea is to
learn a dictionary that is able to provide a good representation
for the class of sound fields that may be generated under the
various possible primary source configurations, and then
approximate the dictionary members by higher-order loudspeakers.
The system was evaluated in a 2D free-space configuration.
Real-world systems are 3D of course, but the 2D system
behaves in the same way and allows quick comparisons
between different designs. Compared to conventional SFR
design strategies, the proposed approach is shown to offer better
performance in cases where the actual primary source parameters
differ from those used in the design, while at the same
time maintaining the same run-time complexity.
Fig. 4. Probability density function of the primary source
location. (Axes: x from -15 to -5, y from -10 to 10.)
Fig. 5. Average reproduction error vs. frequency for a
higher-order primary source with random pattern and location over
1000 runs. (Axes: frequency from 0 to 1500 Hz, error from -60 to 0 dB.
Curves: proposed method, System 1, System 2, System 3.)
6. REFERENCES
[1] J. Ahrens and S. Spors, “Sound field reproduction using planar
and linear arrays of loudspeakers,” IEEE Trans. Audio, Speech,
and Language Processing, vol. 18, pp. 2038–2050, Nov. 2010.
[2] D. Ward and T. Abhayapala, “Reproduction of a plane-wave
sound field using an array of loudspeakers,” IEEE Trans. Audio,
Speech, and Language Processing, vol. 9, pp. 697–707, Sep.
2001.
[3] G. N. Lilis, D. Angelosante, and G. B. Giannakis, “Sound field
reproduction using lasso,” IEEE Trans. Audio, Speech, and
Language Processing, vol. 18, pp. 1902–1921, Nov. 2010.
[4] P. Gauthier and A. Berry, “Sound-field reproduction in-room
using optimal control techniques: Simulations in the frequency
domain,” The Journal of the Acoustical Society of America, vol.
2, pp. 662–678, Feb. 2005.
[5] F. Asano, Y. Suzuki, and D. C. Swanson, “Optimization of
control source configuration in active control systems using
Gram-Schmidt orthogonalization,” IEEE Trans. Speech and Audio
Processing, vol. 7, no. 2, pp. 213–220, 1999.
[6] H. Khalilian, I. V. Bajić, and R. G. Vaughan, “Towards optimal
loudspeaker placement for sound field reproduction,” in Proc.
IEEE ICASSP’13, Vancouver, May 2013, pp. 321–325.
[7] H. Khalilian, I. V. Bajić, and R. G. Vaughan, “Loudspeaker
placement for sound field reproduction by constrained matching
pursuit,” in Proc. IEEE WASPAA’13, New Paltz, NY, Oct.
2013.
[8] M. A. Poletti, F. M. Fazi, and P. A. Nelson, “Sound-field
reproduction systems using fixed-directivity loudspeakers,” The
Journal of the Acoustical Society of America, vol. 127, no. 6,
pp. 3590–3601, 2010.
[9] H. Khalilian, I. V. Bajić, and R. G. Vaughan, “3D sound field
reproduction using diverse loudspeaker patterns,” in Proc. IEEE
ICME’13, San Jose, CA, Jul. 2013.
[10] H. Khalilian, I. V. Bajić, and R. G. Vaughan, “Joint optimization
of loudspeaker placement and radiation patterns for sound
field reproduction,” in Proc. IEEE ICASSP’15, Brisbane, April
2015, pp. 519–523.
[11] M. A. Poletti, F. M. Fazi, and P. A. Nelson, “Sound reproduction
systems using variable-directivity loudspeakers,” The
Journal of the Acoustical Society of America, vol. 129, no. 3,
pp. 1429–1438, 2011.
[12] M. A. Poletti and T. D. Abhayapala, “Spatial sound reproduc-
tion systems using higher order loudspeakers,” in Proc. IEEE
ICASSP’11, Prague, May 2011, pp. 57–60.
[13] P. N. Samarasinghe, M. A. Poletti, S. M. Salehin, T. Abhaya-
pala, and F. M. Fazi, “3D soundfield reproduction using higher
order loudspeakers,” in Proc. IEEE ICASSP’13. IEEE, 2013,
pp. 306–310.
[14] M. Aharon, M. Elad, and A. Bruckstein, “An algorithm for
designing overcomplete dictionaries for sparse representation,”
IEEE Trans. Signal Processing, vol. 54, no. 11, pp. 4311–4322,
2006.
[15] S. Ravishankar and Y. Bresler, “MR image reconstruction
from highly undersampled k-space data by dictionary learning,”
IEEE Trans. Medical Imaging, vol. 30, no. 5, pp. 1028–1041,
2011.
[16] M. Poletti and T. Abhayapala, “Interior and exterior sound field
control using general two-dimensional first-order sources,”
The Journal of the Acoustical Society of America, vol. 129, no.
1, pp. 234–244, 2011.
[17] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal
matching pursuit: recursive function approximation with
applications to wavelet decomposition,” in Proc. Asilomar’93, Nov
1993, pp. 40–44.
[18] T. Betlehem and C. Withers, “Sound field reproduction with
energy constraint on loudspeaker weights,” IEEE Trans. Audio,
Speech, and Language Processing, vol. 19, pp. 2388–2392,
Oct. 2012.
[19] M. Poletti and T. Betlehem, “Design of a prototype variable
directivity loudspeaker for improved surround sound reproduc-
tion in rooms,” in Proc. 52nd AES Conf. Sound Field Control-
Engineering and Perception, 2013, paper no. 6-1.