High-Resolution Ultrasound Sensing for
Robotics using Dense Microphone
Arrays
THOMAS VERELLEN¹,², ROBIN KERSTENS¹,², AND JAN STECKEL¹,² (Member, IEEE)
¹FTI-CoSys Lab, University of Antwerp, Antwerp, Belgium
²Flanders Make Strategic Research Centre, Lommel, Belgium
Corresponding author: Jan Steckel (e-mail: jan.steckel@uantwerpen.be).
ABSTRACT State-of-the-art autonomous vehicles use all kinds of sensors based on light, such as cameras or LIDAR (Laser Imaging Detection And Ranging). These sensors tend to fail when exposed to airborne particles. Ultrasonic sensors are able to work in these environments since they have longer wavelengths and are based on acoustics, allowing them to pass through the aforementioned distortions. However, they have a lower angular resolution compared to their optical counterparts. In this paper a 3D in-air sonar sensor is simulated, consisting of a Uniform Rectangular Array similar to the newly developed micro Real Time Imaging Sonar (µRTIS) by CoSys-Lab. Different direction of arrival techniques are compared for an 8 by 8 uniform rectangular microphone array in a simulation environment to investigate the influence of different parameters in a completely controlled setting. We investigate the influence of the signal-to-noise ratio and the number of snapshots on the angular and range resolution, i.e., the spatial resolution in the directions perpendicular and parallel to the direction of the emitted signal, respectively. We compare these results with real-life imaging results of the µRTIS. The results presented in this paper show that, despite the fact that in-air sonar applications are limited to only one snapshot, more advanced algorithms than Delay-And-Sum beamforming are viable options, which is confirmed with the real-life data captured by the µRTIS.
INDEX TERMS Acoustic signal processing, array signal processing, beamforming, microphone arrays,
sonar.
I. INTRODUCTION
Autonomous vehicles mostly use optical techniques to perceive their environment. However, these sensors tend to fail when exposed to airborne particles [1], [2]. Therefore, it is of interest to complement the data gathered with optical techniques with information gathered from ultrasonic sensors. These sensors use acoustic waves with relatively long wavelengths which pass through the propagation medium's distortions. However, there are downsides to using ultrasonic sensors. Sound waves travel much slower in air than light (343 m/s compared to 299.8×10⁶ m/s); as a result, taking one measurement over a range of six meters takes 35 ms. This makes gathering multiple snapshots impossible, since the environment would have to stay completely stationary during the measurements, which cannot be guaranteed for mobile applications. Additionally, because of the longer wavelengths and the use of acoustics, the angular resolution cannot compete with the accuracy of alternatives such as LIDAR or a regular camera [3].
The embedded Real Time Imaging Sonar (eRTIS) is a high-accuracy 3D sonar sensor based on a pseudo-random 32-microphone array which uses Delay-And-Sum (DAS) beamforming to solve the Direction Of Arrival (DOA) problem [4]–[6]. Recently, a smaller version of the eRTIS was developed to satisfy the need for 3D in-air sonar sensors. This new version features a smaller footprint for all kinds of robotic applications in need of cost-effective 3D environmental perception that does not rely on optics. This smaller version is named the micro Real Time Imaging Sonar (µRTIS) [7]. The µRTIS features 30 microphones placed in a Uniform Rectangular Array (URA) of 5 by 6 microphones in a footprint of 5.7 cm by 4.6 cm.
The original eRTIS used DAS, which is known for its ease of use, low computational cost, and low angular resolution.
To improve the angular resolution, the µRTIS uses the MUltiple SIgnal Classification (MUSIC) algorithm [8]. MUSIC requires more computational effort than DAS but has the prospect of improving the angular resolution of the 3D sonar sensor. However, MUSIC fails to correctly resolve the different sources in coherent-source spaces [9], [10]. This is because the decomposition of the source correlation matrix used by MUSIC requires this matrix to be of full rank. In the presence of K uncorrelated sources, the K highest eigenvalues correspond with the signal subspace. The eigenvectors corresponding to the remaining eigenvalues make up the noise subspace [11]. In a coherent subspace, the eigenvalues of perfectly coherent signals will be zero and MUSIC will fail to correctly resolve these sources. Since the reflected signals originating from the ultrasonic sensor are all coherent, we preprocess them by means of spatial smoothing. Spatial smoothing averages the source correlation matrix by dividing the microphones into several identical overlapping subarrays, which restores its rank [12]–[14].
The aim of this paper is to introduce the µRTIS, the newly developed 3D in-air sensor from CoSys-Lab. In this paper we analyze its performance with respect to various beamforming techniques such as DAS, Capon, and MUSIC. We apply these techniques to a simulated signal of an in-air sonar sensor in a completely controlled environment. This simulation environment enables us to more accurately research the influence of different parameters, which is not possible in real life. The results of these simulations can be used in future work to further improve the accuracy of the newly developed µRTIS. We use MUSIC for its ability to outperform other algorithms even when using only one snapshot [15]. We also include a form of Forward-Backward Spatially Smoothed Capon (FBSS) because it preserves the amplitude information of the calculated spectrum while also being known for its good angular resolution [16]. This amplitude information is lost when using any type of MUSIC because of the applied eigendecomposition of the covariance matrix. The Capon beamformer is also forward-backward spatially smoothed to ensure the localization of the present coherent sources. We also include the first real-life processed imaging results of the µRTIS while using different beamforming algorithms, which were first presented in [7].
This paper is structured as follows: first, we introduce the µRTIS in Section II and explain the signal model and algorithms used in Section III. Then we clarify the methodology in Section IV. In Sections V-A and V-B we investigate the influence of the Signal-to-Noise Ratio (SNR) and the number of snapshots, respectively. Next, we investigate the angular separation in Section V-C, followed by the investigation into the resolvability of multiple sources at the same distance for the different algorithms in Section V-D, ending with real-life results in Section V-E. Finally, in Section VI, we state the conclusion of this paper.
II. µRTIS
The µRTIS is the successor of the previously developed eRTIS. The eRTIS is based on Poisson Disc-Sampling and features a pseudo-random 32-microphone array. It uses DAS to solve the DOA problem and has a footprint of 12.2 cm by 10.8 cm. To support the use of their 3D in-air sonar sensor in mobile robotic applications in which the relatively large size could be an issue, CoSys-Lab used their experience with the eRTIS and developed a smaller version: the µRTIS. The µRTIS features 30 microphones placed in a uniform rectangular array of 5×6 and a smaller emitter, reducing its size to 5.7 cm by 4.6 cm. A side by side view of the two sensors is visible in Fig. 2. The use of a URA enables high-resolution imaging techniques to be applied. Since the microphones are placed with a spacing of 3.85 mm, we can spatially localize ultrasound sources with frequencies up to 44.545 kHz. Currently we use spatially smoothed MUSIC to process the recorded data, which requires a URA and therefore was not an option with the eRTIS. Initial real-life results and a more in-depth description of the hardware and processing of the µRTIS are available in [7]. However, as previously mentioned, in this paper we use a simulation of a rectangular array of 8×8 which has been spatially smoothed with subarrays of 5×5. This research environment enables us to study the influence of different parameters on the imaging results of different DOA methods in a completely controlled environment. These simulation results can then be used to further improve the imaging of the µRTIS. Since the simulations are performed on an 8×8 URA instead of the 5×6 URA of the µRTIS, the simulated results will be better than those obtained when applying the same techniques to real-life data; however, the same principles apply.
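The 44.545 kHz limit follows from the half-wavelength spatial-sampling criterion; as a quick check, using the speed of sound of 343 m/s mentioned in the introduction:

$$f_{\max} = \frac{c}{2d} = \frac{343\ \mathrm{m/s}}{2 \cdot 3.85\ \mathrm{mm}} \approx 44.5\ \mathrm{kHz}.$$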
III. SIGNAL PROCESSING FLOW
A. SIGNAL MODEL
The following is a description of the signal model used in the
simulation environment. This model will replace the received
signals of step (a) in Fig. 1. Assume a 2D array of M
microphones and an emitter close to the array’s phase center,
emitting a signal $s_e(t)$. The signal used in the simulation was a frequency sweep ranging from 20 kHz to 85 kHz.
This signal gets reflected by the K point reflectors in the environment and recorded by the microphone array. We can write the m-th microphone signal as:

$$s_m(t) = \sum_{k=1}^{K} a_k \cdot s_e(t - t_{k,m}) + n(t), \tag{1}$$
in which the time delay $t_{k,m}$ is caused by the range $r_{k,m}$ between the m-th microphone and the k-th source, and $n(t)$ is the noise vector. The attenuation $a_k$ of the signal is due to a number of effects, of which the most important ones are absorption by the reflector and attenuation with distance, which in the free field is equal to $\frac{1}{r}$ [18], with $r$ being the travelled distance of the acoustic wave; this translates to $\frac{1}{2 r_{k,m}}$ for our active sonar application. To focus on the performance of the beamforming algorithms we chose not to take absorption, diffraction or reverberation into account; these effects are an interesting topic for future work.
FIGURE 1. Schematic overview of the steps executed by the signal processing flow, an adapted version of the processing flow in [17] in which the coordinate system used in the simulation is defined on the eRTIS. (a) An eRTIS 3D imaging sonar performs an active measurement of the environment. (b) A matched filter is used to maximize the SNR. (c) Using a complex STFT we extract the frequency information of the recorded signal and estimate the range of the reflections. (d) The one snapshot that will be used consists of the range-(single)frequency information across multiple channels. (e) This data is used as the input of an angle estimation algorithm; in this figure MUSIC was used. (f) Finally we get a 3D image showing the location of the reflectors.
FIGURE 2. Figure from [7] showing a side by side view of the µRTIS (left, 5.7 cm by 4.6 cm) and the eRTIS (right, 12.2 cm by 10.8 cm), two 3D imaging sonar sensors developed by CoSys-Lab. The microphone array in the eRTIS sensor consists of a pseudo-random 32 microphone array placed using Poisson Disc-Sampling, whereas the microphone array in the µRTIS consists of a 30 microphone URA of 5×6.
The noise vector $n(t)$ is used to add white Gaussian noise to the model, making it possible to change the SNR of our recorded signal:

$$\mathrm{SNR} = 10 \cdot \log_{10}\left(\frac{P_{s_m(t)}}{P_{n(t)}}\right)\ [\mathrm{dB}], \tag{2}$$

where $P_{s_m(t)}$ and $P_{n(t)}$ are respectively the power of the signal and the power of the background noise. Our simulated environment consists of point sources in the free field from which the reflected signals only follow one clear path to our sonar sensor. Fig. 3 illustrates the received signal of a source placed at a distance of one meter, using a spectrogram with a window size of 256 and an overlap of 240 samples.
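To make the model concrete, the following is a minimal Python/NumPy sketch of (1) and (2) under the stated assumptions (1 ms chirp from 20 kHz to 85 kHz, free-field $\frac{1}{2r}$ spreading); the sampling rate and the helper name are our own illustrative choices, not taken from the paper:

```python
import numpy as np

fs = 450e3                       # sampling rate [Hz], assumed (> 2 x 85 kHz)
c = 343.0                        # speed of sound [m/s]
t = np.arange(0, 1e-3, 1/fs)     # 1 ms emission, as in Section III-B

# Emitted signal s_e(t): linear chirp sweeping from 20 kHz to 85 kHz
f0, f1 = 20e3, 85e3
s_e = np.cos(2*np.pi*(f0*t + (f1 - f0)/(2*t[-1])*t**2))

def received_signals(mic_pos, sources, snr_db, n_samp=8192):
    """Eq. (1): delayed, attenuated copies of s_e plus white Gaussian noise.
    mic_pos: (M, 3) microphone positions [m]; sources: (K, 3) reflectors [m]."""
    s = np.zeros((mic_pos.shape[0], n_samp))
    for m, pos in enumerate(mic_pos):
        for src in sources:
            r = np.linalg.norm(src - pos)        # range r_{k,m}
            i0 = int(round(2*r/c*fs))            # out-and-back delay in samples
            a = 1.0/(2*r)                        # free-field spreading 1/(2 r_{k,m})
            n = max(0, min(len(s_e), n_samp - i0))
            s[m, i0:i0+n] += a*s_e[:n]
    # Eq. (2): scale the noise power to reach the requested SNR
    p_noise = np.mean(s**2) / 10**(snr_db/10)
    return s + np.sqrt(p_noise)*np.random.randn(*s.shape)
```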
FIGURE 3. Spectrogram of the received signal with an SNR of 20 dB. The source (azimuth, elevation, range) is placed at (30°, 30°, 1 m). The received signal shows the frequency chirp running from around 1 m and 80 kHz to 1.2 m and 20 kHz.
Once the signal reaches the individual microphones, it first gets matched filtered to maximize the SNR and estimate the time of arrival (range) [19]:

$$s_m^{\mathrm{MF}}(t) = \mathcal{F}^{-1}\left[\mathcal{F}(s_m(t)) \cdot \mathcal{F}^{*}(s_e(t))\right]. \tag{3}$$
Fig. 4 shows the output of the matched filter for a single
microphone for the received signal from Fig. 3. In this case,
where the source is placed at a distance of 1 m and not right in
front of the sensor, the distance is estimated at 1.003 m. More
importantly, the Rayleigh width [20], which will be discussed
further in Section IV, decreases from approximately 20 cm to
around 6 mm.
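A sketch of the FFT-based matched filter of (3) is given below; the conjugation of the emitted-signal spectrum and the Hilbert-envelope range pick are standard practice and our own illustrative additions:

```python
import numpy as np
from scipy.signal import hilbert

def matched_filter(s, s_e):
    """Eq. (3): correlate every microphone channel with the emitted chirp.
    s: (M, N) recordings, s_e: (L,) emitted signal; returns (M, N)."""
    n = s.shape[-1] + len(s_e) - 1             # zero-pad for linear correlation
    S = np.fft.rfft(s, n)
    E = np.fft.rfft(s_e, n)
    return np.fft.irfft(S*np.conj(E), n)[..., :s.shape[-1]]

def estimate_range(mf_channel, fs, c=343.0):
    """Range from the peak of the matched-filter envelope, as in Fig. 4."""
    idx = np.argmax(np.abs(hilbert(mf_channel)))
    return idx/fs*c/2                           # divide by 2: out-and-back path
```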
FIGURE 4. Normalized output of the matched filter for a source (azimuth, elevation, range) placed at (30°, 30°, 1 m), showing the signal and its upper envelope. The maximum value is reached at a distance of 1.003 m.
This filtered signal is transformed to the time-frequency domain using a Short-Time Fourier Transform (STFT). We select the frequency bin with the highest energy ($\omega_1$) and use this frequency for all of our snapshots. Next, we filter out the time bins for which the total energy is less than half of the highest energy, leaving us with a limited collection of time (range) slots for a certain frequency and a certain snapshot. This results in a single snapshot $\vec{x}(\omega, t)$ for each recording, where the time $t$ matches a distance from the sensor and $\omega$ is fixed to $\omega_1$. Gathering multiple snapshots by measuring $K$ times results in the matrix notation:

$$X(\omega, t) = \left[\vec{x}_1(\omega_1, t)\ \vec{x}_2(\omega_1, t)\ \cdots\ \vec{x}_K(\omega_1, t)\right]. \tag{4}$$

However, as previously mentioned, we are limited to only one snapshot, resulting in $X(\omega, t)$ being equal to $\vec{x}_1(\omega_1, t)$.
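Steps (c) and (d) of Fig. 1 could then look as follows; this is a minimal sketch assuming SciPy's STFT with the window and overlap quoted for Fig. 3 (256 samples, 240 overlap), with a hypothetical helper name:

```python
import numpy as np
from scipy.signal import stft

def extract_snapshots(s_mf, fs):
    """Complex STFT, keep the strongest frequency bin (omega_1), then keep
    only the time (range) bins with at least half of the maximum energy."""
    f, tt, Z = stft(s_mf, fs, nperseg=256, noverlap=240)   # Z: (M, F, T)
    w1 = np.argmax(np.sum(np.abs(Z)**2, axis=(0, 2)))      # strongest bin
    X = Z[:, w1, :]                                        # (M, T) at omega_1
    energy = np.sum(np.abs(X)**2, axis=0)
    keep = energy >= 0.5*energy.max()                      # half-maximum gate
    return X[:, keep], f[w1], tt[keep]   # one M-channel snapshot per kept range
```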
B. SIMULATION PARAMETERS
The simulated sensor consists of 64 (8×8) omnidirectional microphones placed in a Uniform Rectangular Array (URA). The spacing between the microphones is 5 mm in both directions. The signal emitted by the sonar sensor is a frequency sweep from 20 kHz to 85 kHz with a duration of 1 ms. By placing our microphones 5 mm apart, the half-wavelength criterion is satisfied for frequencies up to 34.3 kHz.
C. DOA METHODS
Following the STFT, we have now reached step (e) in Fig. 1.
The gathered snapshot serves as an input for the beamform-
ing algorithms. We have chosen three different algorithms
to calculate the DOAs of the simulated signal, i.e., DAS,
MUSIC, and Capon. DAS is based on the difference of the
phase of the signal as a result of the spacing between the
microphones. Its spectrum is defined as
$$P_{\mathrm{DAS}}(\theta, \phi) = X(\omega, t) \cdot A(\theta, \phi), \tag{5}$$

in which $A(\theta, \phi)$ is the steering matrix defined for azimuth $\theta$ and elevation $\phi$, steering our signal into all the directions of interest, which in our case is the complete frontal hemisphere. We defined

$$A(\theta, \phi) = \left[a(\theta_1, \phi_1)\ a(\theta_2, \phi_2)\ \ldots\ a(\theta_M, \phi_M)\right], \tag{6}$$

where $a(\theta, \phi)$ is a single steering vector:

$$\begin{aligned}
\psi_{Y,m} &= \sin(\theta)\cdot\cos(\phi)\cdot p_{y,m}\\
\psi_{Z,m} &= \sin(\phi)\cdot p_{z,m}\\
a(\theta, \phi) &= \left[e^{\frac{2\pi j}{\lambda}(\psi_{Y,1}+\psi_{Z,1})}\ \cdots\ e^{\frac{2\pi j}{\lambda}(\psi_{Y,M}+\psi_{Z,M})}\right],
\end{aligned} \tag{7}$$

in which $\lambda$ is the wavelength of the selected frequency and $[p_{y,m}, p_{z,m}]$ is the location of the m-th microphone, with the azimuth and elevation defined as in Fig. 1.
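A direct NumPy transcription of (5)–(7) might look like this; the grid construction is ours, and the DAS power is written in the conventional $|a^H x|^2$ form:

```python
import numpy as np

def steering_matrix(py, pz, az, el, lam):
    """Eq. (7): steering vectors for a URA in the Y-Z plane.
    py, pz: (M,) element positions [m]; az, el: (D,) directions [rad]."""
    psi_y = np.sin(az)[:, None]*np.cos(el)[:, None]*py[None, :]   # (D, M)
    psi_z = np.sin(el)[:, None]*pz[None, :]
    return np.exp(2j*np.pi/lam*(psi_y + psi_z)).T                 # A: (M, D)

def das_spectrum(x, A):
    """Eqs. (5)-(6): Delay-And-Sum power for every steering direction."""
    return np.abs(A.conj().T @ x)**2                              # (D,)

# Example grid over the frontal hemisphere (1 degree steps):
# az, el = np.meshgrid(np.deg2rad(np.arange(-90, 91)),
#                      np.deg2rad(np.arange(-90, 91)))
# A = steering_matrix(py, pz, az.ravel(), el.ravel(), lam=343.0/31.25e3)
```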
Besides DAS, we also used MUSIC [8] to solve the DOA problem:

$$P_{\mathrm{MUSIC}}(\theta, \phi) = \frac{1}{A(\theta, \phi)^H \cdot E \cdot E^H \cdot A(\theta, \phi)}, \tag{8}$$

which is based on the eigenvalue decomposition of the covariance matrix $R$ of $X(\omega, t)$, where $A(\theta, \phi)^H$ in (8) is the Hermitian transpose of $A(\theta, \phi)$. MUSIC resolves the different DOAs by calculating the noise subspace $E$, i.e., the eigenvectors of $R$ belonging to the noise. The noise subspace is formed by the eigenvectors of $R$ corresponding with eigenvalues that are smaller than the mean of all eigenvalues. The estimation results of the MUSIC algorithm depend on six parameters in particular: the number of array elements, the spacing between the elements, the number of snapshots, the SNR, the angle spacing, and the coherency of the signals. In the course of this paper we will keep the number of array elements and the spacing between them constant, while all of our signals are coherent. The influence of the remaining parameters will be the main subject of this paper.
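A compact sketch of (8), using the mean-eigenvalue rule quoted above to split off the noise subspace (our own literal transcription, not the paper's implementation):

```python
import numpy as np

def music_spectrum(X, A):
    """Eq. (8): MUSIC pseudo-spectrum.
    X: (M, S) snapshot matrix (S may be 1); A: (M, D) steering matrix."""
    R = X @ X.conj().T / X.shape[1]          # sample covariance estimate
    w, V = np.linalg.eigh(R)                 # eigenvalues in ascending order
    E = V[:, w < w.mean()]                   # noise subspace: below-mean eigenvalues
    denom = np.sum(np.abs(E.conj().T @ A)**2, axis=0)   # a^H E E^H a per direction
    return 1.0/denom                         # (D,) pseudo-spectrum
```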
The last beamformer used is Capon; to be specific, we use the version of Capon known as the Minimum Power Distortionless Response (MPDR) [21] with diagonal loading:

$$\begin{aligned}
R_b &= R + I_M \cdot b\\
w_0(\theta, \phi) &= \frac{R_b^{-1} \cdot A(\theta, \phi)}{A(\theta, \phi)^H \cdot R_b^{-1} \cdot A(\theta, \phi)}\\
P_{\mathrm{MPDR}}(\theta, \phi) &= w_0(\theta, \phi)^H \cdot R_b \cdot w_0(\theta, \phi),
\end{aligned} \tag{9}$$

where $I_M$ is the identity matrix of size $M \times M$, $b$ is the diagonal loading value, equal to -0.5 since it is best kept small and negative [22], and $R_b^{-1}$ is the inverse of the estimate of the covariance matrix [23].
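The following sketch evaluates (9) literally for a whole steering matrix at once; the vectorization details are ours:

```python
import numpy as np

def mpdr_spectrum(X, A, b=-0.5):
    """Eq. (9): MPDR (Capon) spectrum with diagonal loading b.
    X: (M, S) snapshots; A: (M, D) steering matrix."""
    M = X.shape[0]
    Rb = X @ X.conj().T / X.shape[1] + b*np.eye(M)     # loaded covariance R_b
    num = np.linalg.solve(Rb, A)                       # R_b^{-1} a, per column
    denom = np.sum(A.conj()*num, axis=0)               # a^H R_b^{-1} a, per direction
    W = num/denom                                      # w_0(θ,φ) as columns
    return np.real(np.sum(W.conj()*(Rb @ W), axis=0))  # w_0^H R_b w_0
```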
D. SPATIAL SMOOTHING
Since the use of a sonar sensor makes most of the signal sources coherent, spatial smoothing will be applied to overcome this limitation of the MUSIC algorithm [12], [24]. Spatial smoothing is one of the most used methods for this purpose because of its simplicity and small computational demand. Forward Spatial Smoothing (FSS) averages the signal of a microphone with that of its neighbors. This decorrelates the received signals and improves the performance of the DOA estimators [25], [26]. Forward spatial smoothing requires the URA, in the presence of $K$ coherent sources, to be at least of size $2K \times 2K$. To guarantee the localization of all present sources, forward-backward spatial smoothing will also be implemented, requiring a minimum size of $\frac{3}{2}K \times \frac{3}{2}K$ for the URA [27]. Besides MUSIC, the Capon beamformer will also be forward-backward spatially smoothed to ensure the localization of the present coherent sources. In this paper, the sensor consisting of 8×8 microphones will be divided into subarrays consisting of 5×5 microphones.
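A sketch of 2D forward-backward spatial smoothing for the 8×8 array with 5×5 subarrays follows; row-major element ordering is our assumption. The smoothed covariance then replaces the sample covariance in the MUSIC or Capon sketches above, with the steering matrix rebuilt for the 5×5 subarray geometry:

```python
import numpy as np

def fbss_covariance(X, shape=(8, 8), sub=(5, 5)):
    """Forward-backward spatially smoothed covariance of a URA snapshot
    matrix X (M, S), M = shape[0]*shape[1], using overlapping subarrays."""
    P, Q = shape
    p, q = sub
    Xg = X.reshape(P, Q, -1)                 # (row, column, snapshot) view
    Rf = np.zeros((p*q, p*q), dtype=complex)
    n = 0
    for i in range(P - p + 1):               # slide the subarray over the grid
        for j in range(Q - q + 1):
            Xs = Xg[i:i+p, j:j+q, :].reshape(p*q, -1)
            Rf += Xs @ Xs.conj().T
            n += 1
    Rf /= n*X.shape[1]                       # forward-smoothed covariance
    J = np.eye(p*q)[::-1]                    # exchange matrix
    return 0.5*(Rf + J @ Rf.conj() @ J)      # forward-backward average
```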
IV. METHODOLOGY
We will investigate the accuracy of different DOA algorithms
in function of different properties using a sonar sensor in a
simulation environment. The angular accuracy will be de-
fined by the 3 dB area on the sensor’s Point Spread Function
(PSF) as visible in Fig. 5. Since this is an area on a sphere,
it is defined in steradians (sr). The range resolution will
be defined as the distance between the 3 dB points of the
source. The DOA algorithms that will be compared are DAS,
Capon FBSS, MUSIC, MUSIC FSS, and MUSIC FBSS. For
which we selected the frequency with the highest energy to be
used in (7), which was equal to 31.250 kHz. This frequency
satisfies the half-wavelength criteria for our array where the
distance between the microphones is equal to 5 mm.
FIGURE 5. Spatial image of a 3D in-air sonar sensor using the DAS algorithm with a source (azimuth, elevation, range) placed at (45°, 45°, 1 m), an SNR of 10 dB, and one snapshot. The four slices match four sequential time slices (ranges, 0.97 m to 1.03 m) of the same recording. The angular resolution is defined as the surface on the sphere formed by the 3 dB area in steradians. The range resolution is defined as the distance between the 3 dB points from the different slices.
First, we investigate the influence of the SNR of the received signal. We conduct three experiments, each with one source at a distance of 1 m. Every experiment lets the SNR range from -16 dB to 20 dB and is repeated 20 times. The first experiment has a source (azimuth, elevation, range) at (0°, 0°, 1 m), the second a source at (30°, 30°, 1 m), and the final experiment a source at (60°, 60°, 1 m). The result will be a graph displaying the average resolution and the standard deviation for every DOA algorithm. Only one snapshot will be used, since we are limited to a single snapshot in real-life applications. This is a consequence of the slower speed of sound: to gather multiple measurements the signal sources would have to remain stationary, which would limit the possibilities of the 3D sonar application.
The research into the effect of the number of snapshots on the angular and range resolution will be conducted in a similar manner as the research for the SNR. The SNR will be 3 dB to assure a clearly received signal, and the number of snapshots will range from 1 to 15.
Next, we investigate how close two point sources can be placed while remaining distinguishable. Two sources will be placed at an elevation of 0° and a distance of 1 m. They will be centered around 0° azimuth and will start with a spacing of 40°, moving 2° closer at a time. The criterion for distinguishability will be that the maximum attenuation found in between the two sources should be higher than 3 dB. This is analogous to the Rayleigh criterion or Rayleigh width in optics [20]. This experiment will again be repeated 20 times for the different DOA algorithms with an SNR of 10 dB and only one snapshot.
Finally, we end the simulation results with an investigation into how many sources can be placed at the same range while still being resolvable. Up to five sources will be placed at an elevation of 0° and a distance of 1 m. They will be centered around 0° azimuth with a spacing of 40° in between, again with an SNR of 10 dB and only one snapshot. This experiment will be repeated 20 times for the different DOA algorithms. The output will be the sum of the spatial images for every range, averaged over the 20 repetitions.
We end our results with a real-life imaging run of the µRTIS. We attached the µRTIS to a Pioneer 3-DX, a small radio-guided robot, which was driven through a narrow corridor as is visible in Fig. 8. We captured data at 10 Hz and calculated acoustic images using DAS, Capon, and MUSIC FBSS.
FIGURE 6. Angular resolution in function of SNR (-16 dB to 20 dB) with five different DOA algorithms (Delay-And-Sum, Capon FBSS, MUSIC, MUSIC FSS, and MUSIC FBSS) and one snapshot; a source (azimuth, elevation, range) is placed at (0°, 0°, 1 m), (30°, 30°, 1 m), and (60°, 60°, 1 m).
FIGURE 7. Spatial (range) resolution in function of SNR with five different DOA algorithms (Delay-And-Sum, Capon FBSS, MUSIC, MUSIC FSS, and MUSIC FBSS) and one snapshot; a source (azimuth, elevation, range) is placed at (0°, 0°, 1 m), (30°, 30°, 1 m), and (60°, 60°, 1 m).
FIGURE 8. Setup of the real-life imaging run of the µRTIS. The µRTIS is attached to a Pioneer 3-DX and was driven through a narrow corridor, approximately two meters wide, at our university.
V. RESULTS AND DISCUSSION
A. SNR
Fig. 6 shows the angular resolution in function of the SNR for a source placed at (0°, 0°, 1 m), (30°, 30°, 1 m), and (60°, 60°, 1 m). The SNR ranges from -16 dB to 20 dB. It can be seen that the angular resolution of the MUSIC algorithm applied to a sonar sensor is very dependent on the SNR; this is clearest for the case where the source is placed at (60°, 60°, 1 m). However, when the SNR is higher than 0 dB in the latter case, the resolution appears to be stable. The same trend can be seen for Capon FBSS, MUSIC FSS, and MUSIC FBSS, albeit less severe. The angular resolution for these algorithms is in the worst case around 0.025 sr, compared to 0.3 sr for the MUSIC algorithm. These algorithms also become stable around -10 dB. DAS is the worst performer: its angular resolution is quite stable for every case, but it is the highest of all algorithms once the SNR increases.
Fig. 6 further shows a deterioration of the angular resolution when the source is placed at higher angles. This is most visible for the DAS algorithm, for which the angular resolution rises from 0.04 sr when the source is placed at (0°, 0°, 1 m) to 0.12 sr for a source placed at (60°, 60°, 1 m). This deterioration is an immediate effect of the decreased effective aperture of the sonar sensor. The aperture is proportional to the cosine of the incidence angle, meaning it is halved when the signal originates from -60° or 60°. The angular resolution of all beamformers worsens in this situation.
Fig. 7 shows the range resolution in function of the SNR for a source placed at (0°, 0°, 1 m), (30°, 30°, 1 m), and (60°, 60°, 1 m). The range resolutions for all algorithms are quite stable for all cases, with an exception for MUSIC and MUSIC FSS when the source is placed at (0°, 0°, 1 m). For these cases the range resolution worsens for higher SNRs.
FIGURE 9. Angular resolution in function of the number of snapshots (1 to 15) with five different DOA algorithms (Delay-And-Sum, Capon FBSS, MUSIC, MUSIC FSS, and MUSIC FBSS) and an SNR of 3 dB; a source (azimuth, elevation, range) is placed at (0°, 0°, 1 m), (30°, 30°, 1 m), and (60°, 60°, 1 m). For each case a detailed view of the Capon and MUSIC algorithms is included.
FIGURE 10. Spatial (range) resolution in function of the number of snapshots with five different DOA algorithms (Delay-And-Sum, Capon FBSS, MUSIC, MUSIC FSS, and MUSIC FBSS) and an SNR of 3 dB; a source (azimuth, elevation, range) is placed at (0°, 0°, 1 m), (30°, 30°, 1 m), and (60°, 60°, 1 m).
Overall, the differences in range resolution between the tested algorithms are negligible. When the source is not placed right in front of the sensor, all algorithms perform quite alike, with overlapping standard deviations that do not exceed 6 cm, which in most use cases is sufficient.
B. NUMBER OF SNAPSHOTS
Fig. 9 shows that the angular resolution does not depend on the number of snapshots in a low-snapshot scenario, except for the original MUSIC algorithm. This effect is most visible in the detailed view of the Capon FBSS and MUSIC algorithms, in particular when the source is placed at (60°, 60°, 1 m). MUSIC's mean angular resolution evolves from 0.003 sr for one snapshot to 0.001 sr for five snapshots, whereas the spatially smoothed algorithms remain stable. This is a very positive effect, since the biggest obstacle to using MUSIC in a sonar application is the difficulty of recording multiple snapshots. Fig. 9 further shows the immense gain in resolution compared to DAS. The resolutions for the smaller angles are all stable, but whereas the spatially smoothed MUSIC algorithms and Capon FBSS remain under 0.0016 sr, DAS has a mean resolution of nearly 0.025 sr when the source is placed at (0°, 0°, 1 m) and 0.125 sr for a source placed at (60°, 60°, 1 m). The bottom row of Fig. 9 further reveals that MUSIC FBSS and MUSIC FSS perform slightly better than Capon FBSS; their angular resolutions are the most stable for all angles. This is important for 3D sonar applications that scan the entire frontal hemisphere.
FIGURE 11. Maximum attenuation between two sources in function of angle spacing with five different DOA algorithms (Delay-And-Sum, Capon FBSS, MUSIC, MUSIC FSS, and MUSIC FBSS), with the 3 dB line indicated. The elevation and range of the sources are kept constant at 0° and 1 m, while the azimuth spacing varies from 40° to 10°, centered around 0°. An SNR of 10 dB and one snapshot is used.
The range resolution results visible in Fig. 10 also show no connection between the number of snapshots and the range resolution. The maximum resolution of about 7 cm is a consequence of the preprocessing: to avoid calculating every possible partial spatial spectrum, the time slots get filtered according to the maximum amplitude detected in the entire spectrum. When the total amplitude of a time slot is lower than half the maximum value, its amplitude is neglected. Overall, DAS and Capon FBSS perform most stably for all the cases, whereas the MUSIC algorithms are very dependent on the location of the source, improving slightly as the source moves away from the center of the sensor.
C. SPACING
Fig. 11 shows the maximum attenuation of the acquired spectrum between two point sources. This provides a good indication of how close two sources can be placed while still remaining distinguishable. The 3 dB line will be used as a limit: when the maximum attenuation is lower than the 3 dB line, it is fair to say it is no longer possible to distinguish the two sources from each other. MUSIC performs the worst of the different algorithms; its maximum attenuation does not pass 8 dB, and at 25° it crosses the 3 dB line. It is even outperformed by DAS, which is able to distinguish the two sources down to a spacing of 24° with attenuations well over 10 dB. However, it is important to mention that this figure only displays the maximum attenuation between two sources. In Fig. 12 it can be seen that DAS shows extra peaks in between the real sources which can wrongly be identified as incoming signals. This effect can also be seen with the MUSIC algorithm, albeit less severe.
Fig. 11 further shows the effect of the spatial smoothing: the spatially smoothed algorithms (Capon FBSS, MUSIC FSS, and MUSIC FBSS) all perform quite alike. They are able to clearly distinguish the two sources up until a separation of 14°, with a mean attenuation that stays above 10 dB.
D. RESOLVABILITY
Fig. 12 depicts the different spectra for the five used DOA algorithms when placing multiple reflectors at the same distance. The spectrum of DAS resolves all the sources clearly, but in between the real sources the limited dynamic range could lead to a wrongly identified source. These artefacts are not present with the remaining algorithms. For fewer than five sources, Capon FBSS resolves the sources correctly. The dynamic range of the spectrum is around 60 dB, giving Capon FBSS the highest dynamic range along with MUSIC FBSS. However, when five sources are placed at 1 m, all at the same elevation, Capon and the MUSIC algorithms fail to correctly resolve the sources. This is a consequence of the coherency of the sources. Because all of the sources are placed directly in front of the sonar sensor at an elevation of 0°, our rectangular array acts as a linear one. In that case we only have four subarrays in the plane with an elevation of 0° and are capable of resolving up to four sources [12]. This results in the sources not being resolved correctly. Capon and the MUSIC algorithms seem to locate only four or three sources, respectively, most of them at incorrect locations. The same effect can be seen with DAS, where the Rayleigh criterion is no longer satisfied, merging the responses of the four outermost sources into two erroneous responses. This is not a big issue in real-life performance, since it is very rare to have five or more coherent sources falling within the Rayleigh width (approximately 6 mm in Fig. 4) of the cross-correlation function of the emitted chirp.
MUSIC performs well except for a very limited dynamic range when more than one source is present. MUSIC FSS and MUSIC FBSS perform alike. However, MUSIC FBSS performs better when fewer than five sources are present; in these situations the resolution and dynamic range of MUSIC FBSS are better.
As previously mentioned in Section V-A, the effective aperture of the sensor dramatically worsens for angles higher than 60°. This is visible when four sources are placed: in this case, the PSFs of the sources at -60° and 60° are wider than those of the other sources. The resolved sources in this situation also appear to have a slight offset that worsens for sources placed at higher angles, an effect of the decreased aperture of the sensor relative to these sources, which was also discussed in Section V-A.
In Section V-B it was seen that Capon FBSS had one
of the lowest angular resolutions when there is one source
present, whereas MUSIC FBSS performed the worst. How-
ever, Fig. 12 shows that the MUSIC algorithms, especially
MUSIC FBSS, have lower angular resolutions when more
than one source is present.
E. REAL-LIFE IMAGING RESULTS
The µRTIS was attached to a small radio-guided robot and driven through a narrow corridor at the university, about 2 m wide. The data was captured at 10 Hz and the acoustic images were calculated using DAS, Capon, and MUSIC FBSS, following the processing scheme described in Fig. 1.
FIGURE 12. Spatial spectra (elevation versus azimuth) of five different DOA algorithms (Delay and Sum, Capon, MUSIC, MUSIC FSS, and MUSIC FBSS) with one to five sources centered around 0° azimuth with a spacing of 40°. The sources are placed at a distance of 1 m and an SNR of 10 dB is used. In the case of one source, the source is placed at 0° azimuth. When two sources are present, the sources are placed at -20° and 20° azimuth. In the case of three sources, the sources are placed at -40°, 0°, and 40° azimuth. When four sources are present, the sources are placed at -60°, -20°, 20°, and 60° azimuth. In the final case of five sources, the sources are placed at -80°, -40°, 0°, 40°, and 80° azimuth. The real locations of the sources are indicated by vertical red lines.
FIGURE 13. Real-life imaging results of the µRTIS corridor run at 10.8 s. The recorded data was processed using Delay-And-Sum, Capon, and MUSIC FBSS. The µRTIS was attached to a small radio-guided robot and driven through a narrow corridor. The full run is available at [28].
FIGURE 14. Real-life imaging results of the µRTIS corridor run at 38.3 s. The recorded data was processed using Delay-And-Sum, Capon, and MUSIC FBSS. The µRTIS was attached to a small radio-guided robot and driven through a narrow corridor. The full run is available at [28].
A video of the entire dataset is available at [28]. Two snapshots from this video were taken to discuss the differences between the acquired results. Fig. 13 and Fig. 14 are snapshots taken at 10.8 s and 38.3 s, respectively. In each figure a gray ellipse was drawn around a reflection for reference.
Fig. 13 shows a clear improvement in the imaging results of Capon and MUSIC FBSS compared to DAS: the point spread functions of the resolved reflections are much narrower. The green lines are drawn every 15°, meaning the PSFs of the imaging results of DAS span an area of about 30°, compared to about 7.5° when Capon or MUSIC FBSS is used. Although the PSFs are narrower, the dynamic range slightly worsens; it can be seen that Capon and MUSIC FBSS resolve the sources with a maximum peak that is lower than the 40 dB reached with DAS.
Fig. 14, on the other hand, shows a clear difference between Capon and MUSIC FBSS. In this figure four reflections are indicated which are clearly resolved using DAS and MUSIC FBSS. However, Capon only resolves two of them, and with a magnitude that is very low compared to the other two algorithms. Since these reflections are at the outer rim of the sensor's field of view, centered at around -60°, it can also be seen that the PSFs are wider compared to the results in Fig. 13. The same effect was also previously discussed during the simulations.
VI. CONCLUSION
The simulation results revealed one of the biggest bottlenecks of sonar sensors: the need for a high dynamic range. Reflectors that are placed at higher angles suffer from harsh attenuation of the signal strength due to the decrease in effective aperture of the sensor. Sources at a distance of 1 m are easily resolved with an SNR higher than 0 dB.
Unexpectedly, the SNR and the number of snapshots have little influence on the angular and spatial resolution of the different MUSIC algorithms. This is especially true for the number of snapshots when spatial smoothing is applied, which is a very positive observation since we are limited to one snapshot due to time constraints in our measurement process.
Of the different algorithms that were compared, MUSIC FSS and MUSIC FBSS have the best angular and spatial resolution when the sources are scattered (not placed at (0°, 0°, 1 m)). Their angular resolution also stays stable when the location of the incoming signal changes between 0° and 60°, in contrast to DAS, which always has an angular resolution up to almost 100 times greater than the other algorithms. The angular resolutions of MUSIC and Capon FBSS in function of the SNR behave quite alike, with Capon FBSS behaving the best of the two.
The range resolutions of DAS, Capon FBSS, and MUSIC FBSS all behave similarly for the different sources; in most cases they do not exceed 6 cm, which is fine for most 3D sonar applications.
Spatial smoothing should be used when it is necessary to resolve multiple reflectors that are placed closely together: the spatially smoothed algorithms outperform DAS and MUSIC by more than 10°. However, it should be noted that Capon FBSS does show some artefacts that could be wrongly identified as sources.
Overall, the spatially smoothed algorithms performed the most stably in the presence of one source; their angular resolutions are the lowest and most stable of the compared algorithms. The great advantage of Capon is that the magnitude information of the incoming sources is preserved. If this information is important for the application, a form of Capon should be used. However, it is important to use a sufficient spatial smoothing scheme to suppress the artefacts that are visible when multiple sources are present. In these cases the spatially smoothed MUSIC algorithms perform better.
Another aspect of the MUSIC algorithm that appears not to be an issue in our sonar application is the coherency of the incoming signals when spatial smoothing is not used. When fewer than five sources are placed at the same distance, and therefore all the signals are coherent, the original MUSIC algorithm is still able to resolve the different DOAs. It is important to note that MUSIC FSS and MUSIC FBSS do have a higher dynamic range and better resolution in this situation.
These results prove the feasibility of the µRTIS in a simulated environment, and this was further illustrated with a real-world system. The real-life imaging results confirmed some of the simulation results, such as the ability of MUSIC and Capon to greatly improve the imaging results. They further showed that DAS and MUSIC FBSS were able to correctly identify all the sources that were present, whereas Capon only identified half of them in a certain situation. The sources that were resolved using Capon also had a lower amplitude and were barely visible in the imaging results. This effect was also visible in the simulation results when multiple sources were placed at the outer rim of the sensor.
REFERENCES
[1] A. Filgueira, H. González-Jorge, S. Lagüela, L. Díaz-Vilariño, and
P. Arias, “Quantifying the influence of rain in LiDAR performance,
Measurement, vol. 95, pp. 143–148, jan 2017. [Online]. Available:
https://linkinghub.elsevier.com/retrieve/pii/S0263224116305577
[2] T. G. Phillips, N. Guenther, and P. R. McAree, “When the Dust
Settles: The Four Behaviors of LiDAR in the Presence of Fine Airborne
Particulates,” Journal of Field Robotics, vol. 34, no. 5, pp. 985–1009, aug
2017. [Online]. Available: http://doi.wiley.com/10.1002/rob.21701
[3] B. Siciliano and O. Khatib, Springer Handbook of Robotics. Berlin,
Heidelberg: Springer Berlin Heidelberg, 2008. [Online]. Available:
http://link.springer.com/10.1007/978-3-540-30301-5
[4] R. Kerstens, D. Laurijssen, and J. Steckel, “Low-cost one-bit MEMS
microphone arrays for in-air acoustic imaging using FPGA’s,” in
Proceedings of IEEE Sensors, vol. 2017-Decem. IEEE, oct 2017, pp.
1–3. [Online]. Available: http://ieeexplore.ieee.org/document/8234087/
[5] ——, “ERTIS: A fully embedded real time 3d imaging sonar sensor for
robotic applications,” in Proceedings - IEEE International Conference on
Robotics and Automation, vol. 2019-May. IEEE, may 2019, pp. 1438–
1443. [Online]. Available: https://ieeexplore.ieee.org/document/8794419/
[6] J. Steckel, A. Boen, and H. Peremans, “Broadband 3-D sonar system
using a sparse array for indoor navigation,” IEEE Transactions on
Robotics, vol. 29, no. 1, pp. 161–171, feb 2013. [Online]. Available:
http://ieeexplore.ieee.org/document/6331017/
[7] T. Verellen, R. Kerstens, D. Laurijssen, and J. Steckel, “Urtis: a Small 3d Imaging Sonar Sensor for Robotic Applications,” in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, may 2020, pp. 4801–4805. [Online]. Available: https://ieeexplore.ieee.org/document/9053536/
[8] R. O. Schmidt, “Multiple Emitter Location and Signal Parameter
Estimation.” IEEE Transactions on Antennas and Propagation,
vol. AP-34, no. 3, pp. 276–280, mar 1986. [Online]. Available:
http://ieeexplore.ieee.org/document/1143830/
[9] J. H. Cozzens and M. J. Sousa, “Source Enumeration in a
Correlated Signal Environment,IEEE Transactions on Signal
Processing, vol. 42, no. 2, pp. 304–317, 1994. [Online]. Available:
http://ieeexplore.ieee.org/document/275604/
[10] H. Krim and J. G. Proakis, “Smoothed eigenspace-based parameter
estimation,” Automatica, vol. 30, no. 1, pp. 27–38, jan 1994. [Online].
Available: https://linkinghub.elsevier.com/retrieve/pii/0005109894902267
[11] D. B. Williams, “Detection: Determining the number of sources,” in The
Digital Signal Processing Handbook, Second Edition: The Digital Signal
Processing Handbook, Second Edition: Wireless, Networking, Radar, Sen-
sor Array Processing, and Nonlinear Signal Processing. CRC Press,
Boca Raton, FL, 2009, pp. 8–1–8–10.
[12] T. J. Shan, M. Wax, and T. Kailath, “On Spatial Smoothing for Direction-
of-Arrival Estimation of Coherent Signals,IEEE Transactions on
Acoustics, Speech, and Signal Processing, vol. 33, no. 4, pp. 806–811, aug
1985. [Online]. Available: http://ieeexplore.ieee.org/document/1164649/
[13] M. Wang and B. Xiao, “Separation of coherent multi-path
signals with improved MUSIC algorithm,” in Proceedings - 2011
International Conference on Computational and Information Sciences,
ICCIS 2011. IEEE, oct 2011, pp. 992–995. [Online]. Available:
http://ieeexplore.ieee.org/document/6086369/
[14] Q. Chen and R. Liu, “On the explanation of spatial
smoothing in MUSIC algorithm for coherent sources,” in 2011
International Conference on Information Science and Technology,
ICIST 2011. IEEE, mar 2011, pp. 699–702. [Online]. Available:
http://ieeexplore.ieee.org/document/5765342/
[15] C. Degen, “On single snapshot direction-of-arrival estimation,” in 2017 IEEE International Conference on Wireless for Space and Extreme Environments, WiSEE 2017. IEEE, 2017, pp. 92–97. [Online]. Available: http://ieeexplore.ieee.org/document/8124899/
[16] L. C. Godara, “Application of antenna arrays to mobile communications,
part II: Beam-forming and direction-of-arrival considerations,
Proceedings of the IEEE, vol. 85, no. 8, pp. 1195–1245, 1997.
[Online]. Available: http://ieeexplore.ieee.org/document/622504/
[17] R. Kerstens, T. Verellen, and J. Steckel, “Comparing adaptations of MU-
SIC beamforming for 3D in-air sonar applications with coherent source-
spaces,” in Proceedings on CD of the 8th Berlin Beamforming Conference,
March 2-3, 2020, mar 2020.
[18] F. A. Everest and K. C. Pohlmann, Master Handbook of Acoustics, 5th ed.
New York: McGraw-Hill, 2009.
[19] C. Knapp and G. Carter, “The generalized correlation method for
estimation of time delay,” IEEE Transactions on Acoustics, Speech, and
Signal Processing, vol. 24, no. 4, pp. 320–327, aug 1976. [Online].
Available: http://ieeexplore.ieee.org/document/1162830/
[20] W. M. Saslow, Electricity, Magnetism, and
Light. Elsevier, 2002. [Online]. Available:
https://linkinghub.elsevier.com/retrieve/pii/B9780126194555X50001
[21] H. L. Van Trees, Optimum Array Processing. New York,
USA: John Wiley & Sons, Inc., mar 2002. [Online]. Available:
http://doi.wiley.com/10.1002/0471221104
[22] F. Vincent and O. Besson, “Steering vector uncertainties and
diagonal loading,” in 2004 Sensor Array and Multichannel Signal
Processing Workshop. IEEE, 2004, pp. 327–331. [Online]. Available:
http://ieeexplore.ieee.org/document/1502963/
[23] L. B. Fertig, “Statistical performance of the MVDR beamformer
in the presence of diagonal loading,” in Proceedings of the
IEEE Sensor Array and Multichannel Signal Processing Workshop,
vol. 2000-Janua. IEEE, 2000, pp. 77–81. [Online]. Available:
http://ieeexplore.ieee.org/document/877972/
[24] J. Wen, B. Liao, and C. Guo, “Spatial smoothing based methods for
direction-of-arrival estimation of coherent signals in nonuniform noise,
Digital Signal Processing: A Review Journal, vol. 67, pp. 116–122, 2017.
[25] Y. Yang, C. Wan, C. Sun, and Q. Wang, “DOA estimation for coherent
sources in beamspace using spatial smoothing,” in ICICS-PCM 2003
- Proceedings of the 2003 Joint Conference of the 4th International
Conference on Information, Communications and Signal Processing and
4th Pacific-Rim Conference on Multimedia, vol. 2. IEEE, 2003, pp. 1028–
1032. [Online]. Available: http://ieeexplore.ieee.org/document/1292615/
[26] B. Widrow, K. M. Duvall, R. P. Gooch, and W. C. Newman,
“Signal Cancellation Phenomena in Adaptive Antennas: Causes
and Cures,” IEEE Transactions on Antennas and Propagation,
vol. 30, no. 3, pp. 469–478, may 1982. [Online]. Available:
http://ieeexplore.ieee.org/document/1142804/
[27] Y. Huiyue and Z. Xilang, “On 2-D forward-backward spatial smoothing for azimuth and elevation estimation of coherent signals,” in IEEE Antennas and Propagation Society, AP-S International Symposium (Digest), vol. 2 B, no. August. IEEE, 2005, pp. 80–83. [Online]. Available: http://ieeexplore.ieee.org/document/1551940/
[28] J. Steckel, YouTube channel, 2019. [Online]. Available: https://www.youtube.com/channel/UC9MzqwVf0ZFmt19lJMAgVxQ
THOMAS VERELLEN received the M.Sc. de-
gree in Electronics and ICT Engineering Technol-
ogy from the University of Antwerp, Belgium, in
2019. Following his internship during his master’s
program at the Constrained Systems Lab from
the University of Antwerp he started pursuing
the Ph.D. degree there. As a member of Flanders
Make Strategic Research Centre he researches the
adaptation of ultrasound in the field of condition
monitoring and predictive maintenance.
ROBIN KERSTENS received his B.Sc. degree
in Applied Engineering Electronics and ICT from
the University of Antwerp (Antwerp, Belgium) in
2014 for a bachelor thesis focussed on RADAR
signal processing. This was followed by a M.Sc.
Eng. in Electronics and ICT Engineering Tech-
nology, also at the University of Antwerp but this
time with a thesis focussed on sonar beamforming.
After a short spell in the automotive industry de-
signing pressure sensors, Mr. Kerstens is currently
pursuing a Ph.D. degree on Advanced Array Signal Processing for In-Air sonar. His research interests include signal processing, wave propagation, acoustics, machine learning, and algorithm development.
JAN STECKEL graduated in electronic engineer-
ing at the University College "Karel de Grote"
in Hoboken in 2007. In 2012 he obtained his
doctoral degree at the University of Antwerp at
the Active Perception Lab, with a dissertation
titled "Array processing for in-air sonar systems
- drawing inspirations from biology". During this
period, he developed state-of-the-art sonar sensors,
both biomimetic and sensor-array based. During
his post-doc period he was an active member of
the Centre for Care Technology at the University of Antwerp where he
was in charge of various healthcare-related projects concerning novel sensor
technologies. Furthermore, he pursued industrial exploitation of the patented
3D array sonar sensor which was developed in collaboration during his PhD.
In 2015 he became a tenure track professor at the University of Antwerp
in the Constrained Systems Lab where he researches sensors, sensor arrays
and signal processing algorithms using an embedded, constrained systems
approach.
... On the contrary, other types of sensors, such as radar or sonar, with their electromagnetic and acoustic sensor modalities, respectively, continue to achieve precise and accurate results, even in rough conditions. Therefore, in-air acoustic sensing with sonar microphone arrays has become an active research topic in the last decade [3][4][5][6][7]. This sensing modality can perform well in indoor industrial environments with rough conditions, generate spatial information from that environment, and be used for autonomous activities such as navigation or simultaneous localisation and mapping (SLAM). ...
Preprint
Full-text available
Navigation in varied and dynamic indoor environments remains a complex task for autonomous mobile platforms. Especially when conditions worsen, typical sensor modalities may fail to operate optimally and subsequently provide inapt input for safe navigation control. In this study, we present an approach for the navigation of a dynamic indoor environment with a mobile platform with a single or several sonar sensors using a layered control system. These sensors can operate in conditions such as rain, fog, dust, or dirt. The different control layers, such as collision avoidance and corridor following behavior, are activated based on acoustic flow queues in the fusion of the sonar images. The novelty of this work is allowing these sensors to be freely positioned on the mobile platform and providing the framework for designing the optimal navigational outcome based on a zoning system around the mobile platform. Presented in this paper is the acoustic flow model used, as well as the design of the layered controller. Next to validation in simulation, an implementation is presented and validated in a real office environment using a real mobile platform with one, two, or three sonar sensors in real time with 2D navigation. Multiple sensor layouts were validated in both the simulation and real experiments to demonstrate that the modular approach for the controller and sensor fusion works optimally. The results of this work show stable and safe navigation of indoor environments with dynamic objects.
... On the contrary, other types of sensors, such as radar or sonar, with their electromagnetic and acoustic sensor modalities, respectively, continue to achieve precise and accurate results, even in rough conditions. Therefore, in-air acoustic sensing with sonar microphone arrays has become an active research topic in the last decade [3][4][5][6][7]. This sensing modality can perform well in indoor industrial environments with rough conditions, generate spatial information from that environment, and be used for autonomous activities such as navigation or simultaneous localisation and mapping (SLAM). ...
Article
Full-text available
Navigation in varied and dynamic indoor environments remains a complex task for autonomous mobile platforms. Especially when conditions worsen, typical sensor modalities may fail to operate optimally and subsequently provide unsuitable input for safe navigation control. In this study, we present an approach for navigating a dynamic indoor environment with a mobile platform equipped with one or several sonar sensors, using a layered control system. These sensors can operate in conditions such as rain, fog, dust, or dirt. The different control layers, such as collision avoidance and corridor-following behavior, are activated based on acoustic flow cues in the fusion of the sonar images. The novelty of this work lies in allowing these sensors to be freely positioned on the mobile platform and in providing a framework for designing the optimal navigational outcome based on a zoning system around the mobile platform. This paper presents the acoustic flow model used, as well as the design of the layered controller. Next to validation in simulation, an implementation is presented and validated in a real office environment using a real mobile platform with one, two, or three sonar sensors in real time with 2D navigation. Multiple sensor layouts were validated in both the simulation and the real experiments to demonstrate that the modular approach for the controller and sensor fusion works optimally. The results of this work show stable and safe navigation of indoor environments with dynamic objects.
... Ultrasonic time-of-flight (ToF) measurements are a common technique in many research and industrial fields, spanning from ranging applications [1][2][3][4] to human-computer interaction [5,6] to non-destructive testing (NDT) of materials [7][8][9][10][11][12][13][14][15]. The basic concept of ultrasonic ToF measurements is that a signal is transmitted from an ultrasonic transducer and received at a later time by the same or a different transducer. ...
Article
Full-text available
Ultrasonic time-of-flight (ToF) measurements enable the non-destructive characterization of material parameters as well as the reconstruction of scatterers inside a specimen. The time-consuming and potentially damaging procedure of applying a liquid couplant between specimen and transducer can be avoided by using air-coupled ultrasound. However, to obtain accurate ToF results, the waveform and travel time of the acoustic signal through the air, which are influenced by the ambient conditions, need to be considered. The placement of microphones as signal receivers is restricted to locations where they do not affect the sound field. This study presents a novel method for in-air ranging and ToF determination that is non-invasive and robust to changing ambient conditions or waveform variations. The in-air travel time was determined by utilizing the azimuthal directivity of a laser Doppler vibrometer operated in refracto-vibrometry (RV) mode. The time of entry of the acoustic signal was determined using the autocorrelation of the RV signal. The same signal was further used as a reference for determining the ToF through the specimen in transmission mode via cross-correlation. The derived signal processing procedure was verified in experiments on a polyamide specimen. Here, a ranging accuracy of < 0.1 mm and a transmission ToF accuracy of 0.3 µs were achieved. Thus, the proposed method enables fast and accurate non-invasive ToF measurements that do not require knowledge about transducer characteristics or ambient conditions.
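The cross-correlation step described in this abstract is straightforward to reproduce. The following is a minimal sketch (our own illustration, not the authors' implementation): it estimates the ToF as the lag that maximizes the cross-correlation between a reference burst and a received signal. The sampling rate, burst shape, and delay are assumed values for a synthetic example.

```python
import numpy as np

def tof_by_cross_correlation(reference, received, fs):
    """Estimate the time of flight as the lag (in seconds) that maximizes
    the cross-correlation between a reference and a received signal."""
    # Full cross-correlation; the lag axis runs from -(len(reference)-1)
    # to len(received)-1 samples.
    xcorr = np.correlate(received, reference, mode="full")
    lags = np.arange(-len(reference) + 1, len(received))
    best_lag = lags[np.argmax(np.abs(xcorr))]
    return best_lag / fs

# Synthetic example: a 40 kHz burst delayed by 1.5 ms, buried in noise.
fs = 1_000_000                                # 1 MHz sampling rate (assumption)
t = np.arange(0, 0.2e-3, 1 / fs)
burst = np.sin(2 * np.pi * 40e3 * t) * np.hanning(t.size)
received = np.zeros(4000)
received[1500:1500 + burst.size] += burst     # 1500 samples = 1.5 ms
received += 0.05 * np.random.randn(received.size)
print(tof_by_cross_correlation(burst, received, fs))  # ~1.5e-3 s
```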
... An example of a combination with capacitive sensors to overcome detection limitations at short distances is provided in [101]. The group of Prof. Steckel at the University of Antwerp has a strong focus on 3D A-US for robotic applications, e.g., [102], [103]. The group has achieved remarkable results in areas such as SLAM for the navigation of mobile platforms. ...
Article
Proximity perception is a technology that has the potential to play an essential role in the future of robotics. It can fulfill the promise of safe, robust, and autonomous systems in industry and everyday life, alongside humans, as well as in remote locations in space and underwater. In this survey article, we cover the developments of this field from the early days up to the present, with a focus on human-centered robotics. In this domain, proximity sensors are typically deployed in two scenarios: first, on the exterior of manipulator arms to support safety and interaction functionality, and second, on the inside of grippers or hands to support grasping and exploration. Based on this observation, we begin by proposing a categorization to organize the use cases of proximity sensors in human-centered robotics. Then, we present the sensing technologies and different measuring principles that have been developed over the years, providing a summary in the form of a table. Next, we review the literature regarding the applications that have been proposed. Finally, we give an overview of the most important trends that will shape the future of this domain.
... Ultrasonic measurements are an essential tool in many fields of science, ranging from medical diagnostics [1] and autonomous vehicle positioning [2] to non-destructive testing (NDT) of materials. In NDT, ultrasonic testing (UT) is a common measurement technique used to obtain information about material parameters, structural information, or to detect flaws [3]. ...
Article
Full-text available
Air-coupled ultrasonic (ACU) testing has proven to be a valuable method for increasing the speed of non-destructive ultrasonic testing and for the investigation of sensitive specimens. A major obstacle to implementing ACU methods is the significant signal power loss at the air-specimen and transducer-air interfaces. The loss between transducer and air can be eliminated by using recently developed fluidic transducers. These transducers use pressurized air and a natural flow instability to generate high-power acoustic signals. Due to this self-excited flow instability, the individual pulses are dissimilar in length, amplitude, and phase. These amplitude- and angle-modulated pulses offer an opportunity to further increase the signal-to-noise ratio with pulse compression methods. In practice, multi-input multi-output (MIMO) setups reduce the time required to scan the specimen surface, but demand high pulse discriminability. By applying envelope removal techniques to the individual pulses, the pulse discriminability is increased, allowing only the remaining phase information to be targeted for analysis. Finally, semi-synthetic experiments are presented to verify the applicability of the envelope removal method and to highlight the suitability of the fluidic transducer for MIMO setups.
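To make the envelope-removal idea concrete, one simple realization divides a pulse by its analytic-signal envelope, obtained via the Hilbert transform, so that only the instantaneous-phase structure remains. This is our own hypothetical sketch of the concept, not the method derived in the paper.

```python
import numpy as np
from scipy.signal import hilbert

def remove_envelope(pulse, eps=1e-12):
    """Strip the amplitude envelope from a pulse, keeping only its
    instantaneous-phase structure (a unit-amplitude version of the pulse)."""
    analytic = hilbert(pulse)          # analytic signal via Hilbert transform
    envelope = np.abs(analytic) + eps  # instantaneous amplitude
    return pulse / envelope            # approximately cos(instantaneous phase)

# Example: an amplitude- and phase-modulated burst becomes constant-amplitude.
fs = 1_000_000
t = np.arange(0, 1e-3, 1 / fs)
pulse = np.hanning(t.size) * np.cos(
    2 * np.pi * 40e3 * t + 0.3 * np.sin(2 * np.pi * 2e3 * t))
phase_only = remove_envelope(pulse)
```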
Chapter
Robot perception is the ability of a robotic platform to perceive its environment by means of sensor inputs, e.g., laser, IMU, or motor encoders. Much like humans, robots are not limited to perceiving their environment through vision-based sensors such as cameras. Robot perception, through the scope of this chapter, encompasses acoustic signal processing techniques to locate the presence of a sound source, e.g., a human speaker, within an environment for human-robot interaction (HRI), a topic that has gained great interest within the scientific community. This chapter serves as an introduction to acoustic signal processing within robotics, starting with passive acoustic localization and building up to contemporary active sensing methods, such as the use of neural networks and spatial map generation. The origins of active acoustic localization, which finds its roots in biomimetics, are also discussed.
Article
Full-text available
Sparse spiral phased arrays are advantageous for many emerging air-coupled ultrasonic applications, since grating lobes are prevented without being constrained to the half-wavelength element spacing requirement of well-known dense arrays. As a result, the limitation on the maximum transducer diameter is omitted and the aperture can be enlarged for improving the beamforming precision without requiring the number of transducers to be increased. We demonstrate that in-air imaging, in particular, benefits from these features, enabling large-volume, unambiguous and high-resolution image formation. Therefore, we created an air-coupled ultrasonic phased array based on the Fermat spiral, capable of transmit, receive and pulse-echo operation, as well as 3D imaging. The array consists of 64 piezo-electric 40-kHz transducers (Murata MA40S4S), spanning an aperture of 200 mm. First, we provide an application-independent numerical and experimental characterization of the conventional beamforming performance of all operation modes for varying focal directions and distances. Second, we examine the resulting imaging capabilities using the single line transmission technique. Apart from the high maximum sound pressure level of 152 dB, we validate that unambiguous high-accuracy 3D imaging is possible in a wide field of view (±80°), long range (20 cm to 5 m+) and with a high angular resolution of up to 2.3°. Additionally, we demonstrate that object shapes and patterns of multiple reflectors are recognizable in the images generated using a simple threshold for separation. In total, the imaging capabilities achieved are promising to open up further possibilities, e.g., robust object classification in harsh environments based on ultrasonic images.
Conference Paper
Full-text available
State-of-the-art autonomous vehicles mainly rely on optical sensors to perceive their environment. However, the performance of these sensors worsens dramatically in environments where airborne particles are present. Sonar sensors rely on acoustic waves, which are able to pass through these distortions. The data gathered from sonar could complement the distorted data of the optical sensors in the earlier mentioned environments. In this paper we will discuss the newest 3D in-air sonar sensor developed by CoSys-Lab: the micro Real Time Imaging Sonar (µRTIS). It is the smaller version of the previously developed embedded Real Time Imaging Sonar (eRTIS) and consists of a uniform rectangular array of 5 by 6 microphones, compared to the pseudo-random 32-microphone array based on Poisson disc sampling used with the eRTIS. In this paper we will discuss the hardware of the µRTIS, followed by the implemented processing algorithms to form acoustic images from reflected ultrasonic data. Furthermore, we will discuss the spatial resolution that can be obtained with the presented hardware architecture using different beamforming algorithms. Finally, we will validate the real-life performance, comparing the acoustic images of the µRTIS when using Delay-And-Sum or MUltiple SIgnal Classification to resolve the different directions of arrival.
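For context, the Delay-And-Sum baseline mentioned in this abstract amounts to phase-aligning the microphone signals for each candidate direction and summing them. Below is a narrowband, far-field sketch for a uniform rectangular array; the array geometry, carrier frequency, and element spacing are illustrative assumptions, not the µRTIS design values.

```python
import numpy as np

def ura_steering_vector(nx, ny, d, wavelength, az, el):
    """Far-field steering vector of an nx-by-ny uniform rectangular array
    with element spacing d, for azimuth az and elevation el (radians)."""
    k = 2 * np.pi / wavelength
    ix, iy = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    u = np.cos(el) * np.sin(az)   # direction cosine along the x-axis
    v = np.sin(el)                # direction cosine along the y-axis
    return np.exp(1j * k * d * (ix * u + iy * v)).ravel()

def delay_and_sum(snapshot, nx, ny, d, wavelength, az_grid, el_grid):
    """Scan a grid of directions and return the DAS power map for one
    narrowband snapshot (complex vector of nx*ny microphone samples)."""
    power = np.zeros((len(az_grid), len(el_grid)))
    for i, az in enumerate(az_grid):
        for j, el in enumerate(el_grid):
            a = ura_steering_vector(nx, ny, d, wavelength, az, el)
            power[i, j] = np.abs(a.conj() @ snapshot) ** 2 / a.size
    return power

# Example: one simulated reflector at (20°, 0°) seen by an 8-by-8 array.
wavelength = 343.0 / 40e3     # speed of sound / 40 kHz carrier (assumption)
d = wavelength / 2            # half-wavelength spacing (assumption)
src = ura_steering_vector(8, 8, d, wavelength, np.deg2rad(20.0), 0.0)
snapshot = src + 0.1 * (np.random.randn(64) + 1j * np.random.randn(64))
grid = np.deg2rad(np.linspace(-60, 60, 121))
pmap = delay_and_sum(snapshot, 8, 8, d, wavelength, grid, grid)
```

The power map peaks at the grid point closest to the true direction; subspace estimators such as MUSIC replace this matched-filter scan with a projection onto the noise subspace.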
Conference Paper
The sensors that autonomous robots and vehicles use to perceive their environment are mostly based on optical techniques such as laser, 3D cameras or similar alternatives. The downside of these sensors is that they fail to operate in environments that are occluded by mist, dust or small clutter. Ultrasonic sensors overcome this issue by using sound, allowing for long-wavelength sensing which passes through the medium's distortions. The downside of current ultrasonic sensors is that their angular resolution is worse than that of their optical alternatives. Therefore this paper will explore ways to improve the angular resolution of in-air ultrasonic sensors using the MUltiple SIgnal Classification (MUSIC) algorithm, as it is showing good results in other fields. A limiting factor of the standard MUSIC algorithm in this case is that its accuracy suffers in the presence of coherent sources and when only few snapshots of the environment can be taken, which is often the case in in-air pulse-echo sensing. In this paper we look at several techniques to overcome these problems, including spatial smoothing, compressive sensing, and exploiting Toeplitz matrix theory to reconstruct the full-rank covariance matrix that is necessary to handle coherent sources. We will also briefly explain how to use the MUSIC algorithm with an extremely low number of snapshots, which is likewise necessary when using in-air sonar.
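The combination of spatial smoothing and MUSIC mentioned in this abstract can be sketched for a uniform linear array as follows. This is our own sketch rather than the authors' code, and the array size, subarray length, and source angles are illustrative assumptions; forward smoothing restores the rank of the single-snapshot covariance so that the MUSIC noise subspace becomes meaningful for coherent echoes.

```python
import numpy as np

def forward_spatial_smoothing(R, m, sub):
    """Average the covariances of all overlapping length-`sub` subarrays of
    an m-element ULA covariance R, restoring rank for coherent sources."""
    n_sub = m - sub + 1
    Rs = np.zeros((sub, sub), dtype=complex)
    for i in range(n_sub):
        Rs += R[i:i + sub, i:i + sub]
    return Rs / n_sub

def music_spectrum(R, n_sources, angles, d, wavelength):
    """Evaluate the MUSIC pseudo-spectrum of a ULA covariance matrix."""
    eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
    En = eigvecs[:, :-n_sources]              # noise subspace
    m = R.shape[0]
    k = 2 * np.pi / wavelength
    spectrum = []
    for theta in angles:
        a = np.exp(1j * k * d * np.arange(m) * np.sin(theta))
        spectrum.append(1.0 / np.real(a.conj() @ En @ En.conj().T @ a))
    return np.asarray(spectrum)

# Single-snapshot, coherent case: two in-phase echoes at -10° and 15°.
m, d, wl = 8, 0.5, 1.0                        # spacing in wavelengths (assumption)
a1 = np.exp(1j * 2 * np.pi * d * np.arange(m) * np.sin(np.deg2rad(-10.0)))
a2 = np.exp(1j * 2 * np.pi * d * np.arange(m) * np.sin(np.deg2rad(15.0)))
x = a1 + a2 + 0.05 * (np.random.randn(m) + 1j * np.random.randn(m))
R = np.outer(x, x.conj())                     # rank-1 single-snapshot covariance
Rs = forward_spatial_smoothing(R, m, sub=5)
angles = np.deg2rad(np.linspace(-60, 60, 241))
P = music_spectrum(Rs, 2, angles, d, wl)      # peaks near -10° and 15°
```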
Chapter
The processing of signals received by sensor arrays generally can be separated into two problems: (1) detecting the number of sources and (2) isolating and analyzing the signal produced by each source. We make this distinction because many of the algorithms for separating and processing array signals assume that the number of sources is known a priori and may give misleading results if the wrong number of sources is used [3]. A good example is the errors produced by many high-resolution bearing estimation algorithms (e.g., MUSIC) when the wrong number of sources is assumed. Because, in general, it is easier to determine how many signals are present than to estimate the bearings of those signals, signal detection algorithms typically can correctly determine the number of signals present even when bearing estimation algorithms cannot resolve them. In fact, the capability of an array to resolve two closely spaced sources could be said to be limited by its ability to detect that there are actually two sources present. If we have a reliable method of determining the number of sources, not only can we correctly use high-resolution bearing estimation algorithms, but we can also use this knowledge to exploit the information obtained from the bearing estimation algorithms more effectively. If the bearing estimation algorithm gives fewer source directions than we know there are sources, then we know that there is more than one source in at least one of those directions and have thus essentially increased the resolution of the algorithm. If analysis of the information provided by the bearing estimation algorithm indicates more source directions than we know there are sources, then we can safely assume that some of the directions are the results of false alarms and may be ignored, thus decreasing the probability of false alarm for the bearing estimation algorithms. In this section we present and discuss the more common approaches to determining the number of sources.
Article
Spatial smoothing techniques have been widely used to estimate the directions-of-arrival (DOAs) of coherent signals. However, these techniques are generally derived under the assumption of uniform white noise, and their performance may therefore deteriorate significantly when the noise is nonuniform. This motivates us to develop new methods for DOA estimation of coherent signals in nonuniform noise in this paper. In our methods, the noise covariance matrix is first directly or iteratively calculated from the array covariance matrix. Then, the noise component in the array covariance matrix is eliminated to achieve a noise-free array covariance matrix. By mitigating the effect of noise nonuniformity, conventional spatial smoothing techniques developed for uniform white noise can thus be employed to reconstruct a full-rank signal covariance matrix, which enables us to apply subspace-based DOA estimation methods effectively. Simulation results demonstrate the effectiveness of the proposed methods.
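The de-noising step described here can be illustrated with a simple alternating fit: estimate a rank-K signal part and a diagonal (nonuniform) noise part of the array covariance, subtract the noise, and hand the result to the usual smoothing-plus-subspace pipeline. This is a hypothetical sketch of the idea, not the estimator derived in the paper.

```python
import numpy as np

def remove_nonuniform_noise(R, n_sources, n_iter=50):
    """Alternately fit a rank-`n_sources` signal covariance and a diagonal
    (sensor-dependent) noise covariance to R; return the de-noised matrix."""
    d = np.zeros(R.shape[0])                    # per-sensor noise power
    for _ in range(n_iter):
        # Low-rank signal part of the current de-noised covariance.
        eigvals, eigvecs = np.linalg.eigh(R - np.diag(d))
        Vs = eigvecs[:, -n_sources:]
        Ls = np.clip(eigvals[-n_sources:], 0, None)
        R_signal = (Vs * Ls) @ Vs.conj().T
        # Re-estimate the diagonal noise from the residual.
        d = np.clip(np.real(np.diag(R - R_signal)), 0, None)
    return R - np.diag(d)

# Example: two sources in noise whose power differs per microphone.
m, K, N = 8, 2, 200
A = np.exp(1j * np.pi * np.outer(np.arange(m), np.sin(np.deg2rad([-12.0, 20.0]))))
S = (np.random.randn(K, N) + 1j * np.random.randn(K, N)) / np.sqrt(2)
noise_std = np.linspace(0.1, 0.6, m)[:, None]   # nonuniform across sensors
X = A @ S + noise_std * (np.random.randn(m, N) + 1j * np.random.randn(m, N)) / np.sqrt(2)
R = X @ X.conj().T / N
R_clean = remove_nonuniform_noise(R, K)         # input for smoothing / MUSIC
```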
Article
This paper describes the behavior of a commercial light detection and ranging (LiDAR) sensor in the presence of dust. This work is motivated by the need to develop perception systems that must operate where dust is present. This paper shows that the behavior of measurements from the sensor is systematic and predictable. LiDAR sensors exhibit four behaviors that are articulated and understood from the perspective of the shape of the return signals from emitted light pulses. We subject the commercial sensor to a series of tests that measure the return pulses and show that they are consistent with theoretical predictions of behavior. Several important conclusions emerge: (i) where LiDAR measures dust, it does so to the leading edge of a dust cloud rather than as random noise; (ii) dust starts to affect measurements when the atmospheric transmittance is less than 71%–74%, but this is quite variable with conditions; (iii) LiDAR is capable of ranging to a target in dust clouds with transmittance as low as 2% if the target is retroreflective and 6% if it is of low reflectivity; (iv) the effects of airborne particulates such as dust are less evident in the far field. The significance of this paper lies in providing insight into how better to use measurements from off-the-shelf LiDAR sensors in solving perception problems.
Article
LiDAR systems are emerging as one of the key sensing systems for autonomous vehicles. The present work quantifies the influence of rain on different LiDAR parameters: range, intensity, and number of detected points. Six areas with different target materials are used for the study. Range measurements remain stable even under heavy rain, with variations always lower than 20 cm. These variations stem from the experimental procedure (averaging of points detected from a surface) and not from instabilities in the LiDAR detection under rain. The detected LiDAR intensity and the number of sampled points attenuate with increasing rain intensity. The drop size distribution is assumed constant over the study area. The strongest decrease in the number of points appears for pavement; however, the intensity returned from pavement is not especially influenced by rain. The remaining materials show a similar trend in both intensity and number of detected points.
Conference Paper
MUSIC algorithm is an effective method for direction of arrival(DOA), but it can only do with incoherent signals. There are coherent signal and related signal in the actual communication environment. If the condition does not meet, there will be bias occured and even failure in the use of MUSIC algorithm for signal DOA estimation. In order to solve the problem of the DOA estimation of coherent signals, an improved algorithm is presented in this paper. By processing the convariance matrix of the array output signal, the rank of the signal covariance is restored to Rank(Rx)= D, which can effectively estimate the signal DOA and identify the coherent signal source. Simulation results show that the DOA can be correctly estimationed when the signal interval is relatively small and the signal to noise ratio is small. This indicates that the algorithm is effective and has practical value in engineering.