Greedy alternative for room geometry estimation from acoustic echoes: A subspace-based method

Abstract

Poster for ICASSP 2017
GREEDY ALTERNATIVE FOR ROOM GEOMETRY ESTIMATION FROM
ACOUSTIC ECHOES: A SUBSPACE-BASED METHOD
Mario Coutino¹, Martin Bo Møller²,³, Jesper Kjær Nielsen²,³, Richard Heusdens¹
¹Delft University of Technology, The Netherlands
²Bang & Olufsen A/S, Denmark
³Aalborg University, Denmark
Introduction
Knowledge of the room shape can benefit a large number of applications. For example, in the creation of personal sound zones [1] one needs to know the room impulse response (RIR) at different locations, which could be modeled if the room geometry is available.
Figure 1: Motivation for room geometry estimation.
Echoes generated by sound reflected from the room walls carry information about the geometry of the enclosure. By modeling room reflections using virtual sources [2], it is possible to exploit the geometric duality of this representation to estimate the room boundaries.
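As a minimal sketch of the virtual (image) source idea, the snippet below mirrors a source across a planar wall and then recovers that wall as the perpendicular bisector plane between the source and its image. The function names, the source position, and the wall at x = 4 are hypothetical values chosen for illustration; they are not taken from the poster.

```python
import numpy as np

def image_source(source, wall_point, wall_normal):
    """Mirror `source` across the plane through `wall_point` with normal `wall_normal`."""
    n = wall_normal / np.linalg.norm(wall_normal)
    return source - 2.0 * np.dot(source - wall_point, n) * n

def wall_from_image(source, image):
    """Recover the wall as the plane bisecting the segment between source and image."""
    normal = (image - source) / np.linalg.norm(image - source)
    point = 0.5 * (source + image)
    return point, normal

# Hypothetical source position and wall (the plane x = 4), for illustration only
source = np.array([1.0, 2.0, 1.5])
img = image_source(source, np.array([4.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
point, normal = wall_from_image(source, img)
```

Once the image source positions are estimated, each first-order image therefore fixes one wall plane, which is why locating the images is enough to reconstruct the room boundaries.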
Acoustic Echoes Sorting Problem
When multiple microphones, randomly placed in the room, are used to detect the acoustic echoes in the RIRs, ambiguities arise when labeling the echoes according to the wall that produces them. This problem is illustrated in Fig. 2.
[Diagram: source s and microphones r1, r2, r3, each with an echo timeline over t whose peaks carry the unresolved labels d1?/c, d2?/c, d3?/c.]
Figure 2: Ambiguity in the echo labels due to the different order of arrival of wall reflections.
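The labeling ambiguity can be reproduced with a small sketch: two microphones receive the same two wall reflections, but in opposite order, so sorting the echoes by arrival time does not identify the wall. The positions, the two walls (x = 0 and y = 0), and the speed of sound are assumed values for illustration only.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

# First-order images of a source at (2, 2, 1) for two hypothetical walls
images = {
    "wall x=0": np.array([-2.0, 2.0, 1.0]),
    "wall y=0": np.array([2.0, -2.0, 1.0]),
}
mics = {
    "r1": np.array([0.5, 3.0, 1.0]),
    "r2": np.array([3.0, 0.5, 1.0]),
}

def arrival_order(mic):
    """Return wall names sorted by the echo delay observed at `mic`."""
    delays = {wall: np.linalg.norm(mic - img) / C for wall, img in images.items()}
    return sorted(delays, key=delays.get)

orders = {name: arrival_order(pos) for name, pos in mics.items()}
```

Here `orders["r1"]` and `orders["r2"]` list the walls in reversed order, so the k-th peak in one RIR cannot simply be paired with the k-th peak in another.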
Data Model
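The squared distance $d_{m,n}$ for the $(m, n)$-th microphone-source pair can be written as
$$(x_m - X_n)^2 + (y_m - Y_n)^2 + (z_m - Z_n)^2 = d_{m,n}. \tag{1}$$
This can be expressed as an inner product as
$$\mathbf{R}_m^T \mathbf{S}_n = d_{m,n}, \tag{2}$$
where the two vectors $\mathbf{R}_m$ and $\mathbf{S}_n$ are given by
$$\mathbf{R}_m = [\mathbf{r}_m^T \mathbf{r}_m,\; -2x_m,\; -2y_m,\; -2z_m,\; 1]^T \in \mathbb{R}^{5 \times 1}, \tag{3}$$
$$\mathbf{S}_n = [1,\; X_n,\; Y_n,\; Z_n,\; \mathbf{s}_n^T \mathbf{s}_n]^T \in \mathbb{R}^{5 \times 1}. \tag{4}$$
Collecting all the squared distances $d_{m,n}$ for the pairs $(m, n)$ leads to the distance matrix
$$\mathbf{D} = \mathbf{R}^T \mathbf{S} \in \mathbb{R}^{M \times N}, \tag{5}$$
where $\mathbf{R} = [\mathbf{R}_1, \ldots, \mathbf{R}_M]$ and $\mathbf{S} = [\mathbf{S}_1, \ldots, \mathbf{S}_N]$ are the known microphone and unknown image source position matrices, respectively.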
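The factorization in (5) can be checked numerically. The sketch below assumes that r_m = [x_m, y_m, z_m]^T and s_n = [X_n, Y_n, Z_n]^T denote the microphone and image source positions, draws random positions (arbitrary values, not from the poster), and compares R^T S with the squared distances computed directly. It also confirms that D has rank at most 5, which is the property the method relies on.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 9, 6                                  # sizes used in Figure 5
mics = rng.uniform(0, 5, size=(M, 3))        # rows are r_m = [x_m, y_m, z_m]
srcs = rng.uniform(-5, 10, size=(N, 3))      # rows are s_n = [X_n, Y_n, Z_n]

# R is 5 x M with columns R_m from (3); S is 5 x N with columns S_n from (4)
R = np.vstack([np.sum(mics**2, axis=1), -2 * mics.T, np.ones(M)])
S = np.vstack([np.ones(N), srcs.T, np.sum(srcs**2, axis=1)])
D = R.T @ S                                  # distance matrix from (5)

# Squared distances computed directly from the positions, as in (1)
D_direct = np.sum((mics[:, None, :] - srcs[None, :, :])**2, axis=2)
rank_D = np.linalg.matrix_rank(D)
```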
Can we exploit the data model in (5) to avoid an NP-hard problem and find feasible echo combinations?
Greedy Method
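[Flow chart elements recoverable from the extraction: generate $\tilde{\mathbf{D}}$ ($M \times N^M$) from $\mathbf{D}$ ($M \times N$); evaluate $f(c_i)/\|\tilde{\mathbf{D}}_{c_i}\|_2^2$ for each column $c_1, c_2, \ldots, c_{|C|}$; sorting (ascending order) into $\tilde{\mathbf{D}}_C$ ($M \times |C|$), the region that contains the true combinations; try columns sequentially with the checks $\mathrm{rank}(\mathbf{E}_c, \epsilon) \le 5$ and "Has unique elements?"; outcomes "Correct combination" and "Wrong combination".]
Figure 3: Flow of the proposed greedy method for sorting acoustic echoes.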
How can we identify feasible combinations?
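Orthogonal projection (subspace filtering): $f(c) = \|\Pi_{\mathcal{N}(\mathbf{R})} \tilde{\mathbf{D}}_c\|_2^2, \quad c \in [1, \ldots, N^M], \quad \Pi_{\mathcal{N}(\mathbf{R})} \mathbf{R}^T = \mathbf{0}.$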
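A minimal sketch of the subspace filtering step in the noise-free case follows. It assumes the projector onto the null space of R is formed as I minus pinv(R) R, enumerates all N^M candidate columns for small, arbitrary sizes (M = 7, N = 3, random positions), and sorts them by the normalized cost. These choices are illustrative and are not the authors' implementation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
M, N = 7, 3                                  # small sizes so all N**M columns fit in memory
mics = rng.uniform(0, 5, size=(M, 3))
srcs = rng.uniform(-5, 10, size=(N, 3))

R = np.vstack([np.sum(mics**2, axis=1), -2 * mics.T, np.ones(M)])   # 5 x M
D = np.sum((mics[:, None, :] - srcs[None, :, :])**2, axis=2)        # M x N

# Projector onto the null space of R, so that Pi @ R.T == 0
Pi = np.eye(M) - np.linalg.pinv(R) @ R

# Each candidate column picks one echo index per microphone
combos = np.array(list(itertools.product(range(N), repeat=M)))      # N**M x M
D_tilde = D[np.arange(M), combos].T                                  # M x N**M

# Normalized cost f(c) / ||D_tilde_c||^2, sorted in ascending order
f = np.sum((Pi @ D_tilde)**2, axis=0) / np.sum(D_tilde**2, axis=0)
order = np.argsort(f)
best = combos[order[:N]]                     # the N columns with the smallest cost
```

Without noise, the N lowest-cost columns are the ones that pick the same image source at every microphone, because only those columns lie in the range of R^T. With noisy distances the cost of the true columns is no longer zero, which is where the sorting and the rank check come in.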
Rank constraint for Euclidean distance matrices (EDMs): $\mathrm{rank}(\tilde{\mathbf{E}}_c, \epsilon) \le 5.$
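The rank test can be sketched as follows. The poster does not spell out how the matrix in the rank constraint is built, so this example assumes it is the EDM of the M microphones augmented with one candidate source whose squared distances are the candidate column, and that the numerical rank counts singular values above a relative tolerance eps. The helper names and all numeric values are hypothetical.

```python
import numpy as np

def numerical_rank(A, eps):
    """Count singular values larger than eps times the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > eps * s[0]))

def augmented_edm(mics, d_col):
    """EDM of the microphones plus one candidate source with squared distances d_col."""
    M = len(mics)
    E = np.zeros((M + 1, M + 1))
    E[:M, :M] = np.sum((mics[:, None, :] - mics[None, :, :])**2, axis=2)
    E[:M, M] = E[M, :M] = d_col
    return E

rng = np.random.default_rng(2)
mics = rng.uniform(0, 5, size=(8, 3))        # assumed microphone positions
src = np.array([6.0, -1.0, 2.0])             # assumed image source position
d_true = np.sum((mics - src)**2, axis=1)     # consistent candidate column
d_wrong = d_true.copy()
d_wrong[0] += 3.0                            # one mislabeled echo

rank_true = numerical_rank(augmented_edm(mics, d_true), eps=1e-8)
rank_wrong = numerical_rank(augmented_edm(mics, d_wrong), eps=1e-8)
```

An EDM of points in three dimensions has rank at most 5, so a consistent column passes the test while a column containing a mislabeled echo generally raises the rank above 5.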
Results
Reconstruction results for 3D rectangular rooms. The proposed greedy method performs orders of magnitude
faster than the pure graph-based method [3], with comparable estimation accuracy.
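[Plot: $\|\Pi_R \tilde{\mathbf{D}}_c\| / \|\tilde{\mathbf{D}}_c\|$ (log scale, $10^{-4}$ to $10^{0}$) versus indexes of sorted columns (log scale, $10^{0}$ to $10^{7}$), with curves for σ = 1 mm, 5 mm, 1 cm, 3 cm, and 5 cm.]
Figure 4: Columns of $\tilde{\mathbf{D}}$ sorted by the value of the projection for different noise levels.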
[Plot: RMSE [m] (log scale, $10^{-2}$ to $10^{1}$) versus peak position uncertainty σ [m] at 0.001, 0.005, 0.01, 0.03, and 0.05, for the (modified) graph-based and the subspace-based (greedy) methods.]
Figure 5: Estimation error comparison for M = 9 and N = 6.
[Plot: relative computational time (0 to 30) versus number of microphones (5, 6, 7), for the subspace-based (greedy), graph-based (subspace filtering), and graph-based methods.]
Figure 6: Comparison of computation time between the graph-based methods and the greedy approach.
Figure 7: Illustration of a 3D reconstruction of a rectangular room.
Contributions and Conclusion
A greedy approach for acoustic echo labeling using the complementary orthogonal projection of the receivers' location matrix is proposed.
Perfect echo labeling is achieved through subspace filtering in the noise-free case.
In the presence of noise, the combination of the rank constraint for EDMs, the subspace filtering, and a sorting strategy makes it possible to label the acoustic echoes greedily.
The proposed greedy method provides an accuracy comparable with the current state-of-the-art method based on graph theory, but at a reduced computational cost.
The effects of uncertainties in the distance measurements are shown through numerical experiments.
References
[1] T. Betlehem, et al. "Personal Sound Zones: Delivering interface-free audio to multiple listeners." IEEE Signal Processing Magazine 32.2 (2015): 81-91.
[2] J.B. Allen and D.A. Berkley. "Image method for efficiently simulating small room acoustics." The Journal of the Acoustical Society of America 65.4 (1979): 943-950.
[3] I. Jager, R. Heusdens, and N.D. Gaubitch. "Room geometry estimation from acoustic echoes using graph-based echo labeling." 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016.