USING MICROPHONE ARRAYS TO RECONSTRUCT MOVING
SOUND SOURCES FOR AURALIZATION
Fanyu Meng, Michael Vorlaender
Institute of Technical Acoustics, RWTH Aachen University, Germany
{fanyu.meng@akustik.rwth-aachen.de}
Abstract
Microphone arrays are widely used for sound source characterization, including moving sound sources. Beamforming is one of the post-processing methods used to localize sound sources with a microphone array in order to create a color map (the so-called "acoustic camera"). The beamformer response depends on the array pattern, which is determined by the array shape. Irregular arrays are able to avoid the spatial aliasing that causes grating lobes and degrades the array's ability to find the spatial positions of sources. With precise characteristics obtained from the beamformer output, the sources can be reconstructed with respect to not only their spatial distribution but also their spectra. Spectral modeling methods, e.g. spectral modeling synthesis (SMS), can therefore be combined with these results to obtain source signals for auralization.
In this paper, we design a spiral microphone array for a specific frequency range and resolution. In addition, an unequal-spacing rectangular array is developed to compare its performance with that of the spiral array. Since the second array is separable, the Kronecker Array Transform (KAT) can be used to accelerate the beamforming calculation. The beamforming output can be optimized by a deconvolution approach that removes the array response function convolved with the source signals. With the reconstructed source spectrum generated from the deconvolved beamforming output, the source signal is synthesized separately from tonal and broadband components.
Keywords: auralization, synthesizer, microphone arrays, beamforming, SMS
PACS no. 43.60.Fg
1 Introduction
Moving sound source auralization finds application in environmental noise problems caused by vehicles, supporting better noise control and subsequent traffic planning in densely populated urban and rural areas. Auralization is an effective modern technique that makes it possible to perceive simulated sound intuitively instead of describing acoustic properties with abstract numerical quantities. Simply put, auralization converts numerical acoustic data into audible sound files through three procedures: sound source generation, sound propagation and reproduction [1].
Sound source generation consists of forward and backward approaches. The forward method is based on a priori knowledge of the sources, such as the physical generation mechanism and spectral data, to obtain the source signal, while the backward method acquires the signal by inverting the propagation procedure (e.g., directivity, Doppler effect, spherical spreading) from a recording [2]. When multiple sources, especially moving sources, radiate simultaneously, the source
signals cannot be obtained from near-field recordings in an anechoic chamber. In this case, the backward (inverse) method can be utilized for moving sound source synthesis for auralization.
Beamforming is a popular sound source localization method based on the array technique [3].
However, the beamforming output in the frequency domain cannot be directly considered as the
source spectrum. It is the convolution of the spectrum and the array's point spread function (PSF).
Therefore, DAMAS (Deconvolution Approach for the Mapping of Acoustic Sources) was applied to
remove the PSF from the output [4]. Despite improving the acoustic image, DAMAS has the drawback of a high computational cost, since it handles large matrices and many iterations. To accelerate the beamforming process, the Kronecker Array Transform (KAT) can be used as a fast separable transform [5], where "separable" means that the microphones and the reconstruction grid are distributed rectangularly and non-uniformly [6]. By using these methods, sound sources can be precisely localized with a much reduced computation time. In this research, beamforming is also extended to auralization: the positions of the moving sound sources are identified, and the beamforming output is then used to reconstruct the source signals.
Despite the advantages mentioned above, the beamforming output spectrum cannot be directly taken as the source signal. It is known that DAMAS and other deconvolution methods can be used to remove the PSF. However, the PSF is rather unpredictable. For example, as sound from moving vehicles propagates in an outdoor environment, the measurement conditions are under poorer control than in the free-field conditions of an anechoic chamber. Thus, non-predictable acoustic effects can occur in the meantime, leading to an uncertain PSF. Even if the measurement conditions are well controlled and the beamformer's characteristics are perfectly removed, the reconstructed signal can only be used for the specific cases in which the measurement of the particular sound sources takes place. Besides, not all information in the source needs to be reflected in the auralization, since the human hearing system is not sensitive enough to perceive every detail. Under these considerations, parameterization overcomes the drawbacks mentioned above: the beamforming output spectrum can be parameterized while excluding unnecessary features. Spectral modelling synthesis (SMS) is a way to parameterize and synthesize a spectrum separately into deterministic and stochastic components, based on the fact that most sounds consist of these two components [7]. Parameters representing the source are generated using SMS. Another benefit of parameterization is that it enables the generation of variable source signals from a single sample: by changing parameters, sound samples are generated dynamically according to different acoustic scenes. This provides more possibilities for auralization in real-time virtual reality systems without having to conduct repeated psychoacoustic measurements [8].
Even though some recordings are blurred by the poorly conditioned measurement environment, the
frequencies of the tones in the spectra after de-Dopplerization can still remain correct [2]. The
frequency, amplitude and phase are obtained by peak detection and continuation. The deterministic
component is synthesized by generating sinusoidal signals with the previously mentioned parameters,
and subsequently the broadband component is represented by the subtraction of the synthesized tonal
signals from the original beamforming output spectrum.
The objective of this paper is to develop an efficient synthesizer for moving sound sources based on microphone arrays. The sound field produced by a moving sound source is described, and a de-Dopplerization technique is introduced to prepare the signals for beamforming. Using 32 microphones, a spiral and a separable array are designed with similar resolution. The moving sound sources are localized by applying beamforming to the de-Dopplerized input signals. Furthermore, beamforming is extended to source signal synthesis: parameterization based on SMS uses the beamforming output spectrum as a sample, from which different sound samples can be generated to adapt to different acoustic scenarios.
2 Moving sound source and de-Dopplerization
According to [9], the sound field generated by a moving sound source can be written as

$$ p(r, t) = \frac{\dot{q}(t_e)}{4\pi r \left(1 - M\cos\theta\right)^{2}} + \frac{q(t_e)\, c \left(M\cos\theta - M^{2}\right)}{4\pi r^{2} \left(1 - M\cos\theta\right)^{3}} \qquad (1) $$

where q(t_e) is the source strength at the emission time t_e, \dot{q}(t_e) is its time derivative, r is the distance between source and receiver, c is the speed of sound, v is the source speed, M = v/c is the Mach number and θ is the angle between the direction of source motion and the source-receiver direction.
When the receiver is far away from the source and the source moves at a relatively low speed (normally M < 0.2), the second term can be omitted and Eq. (1) reduces to

$$ p(r, t) \approx \frac{\dot{q}(t_e)}{4\pi r \left(1 - M\cos\theta\right)^{2}} \qquad (2) $$
To eliminate the Doppler effect, the recordings need to be interpolated and re-sampled. Taking the emission time t_e as the reference time, the corresponding reception time is t_r = t_e + r(t_e)/c. The recorded signal, which is sampled at equally spaced reception times, is then interpolated and re-sampled at the reception times that correspond to equally spaced emission times. This procedure is called de-Dopplerization; the resulting de-Dopplerized signal at microphone m is denoted p_m(t) in the following.
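For the pass-by scenario simulated later in this paper (a source moving at 40 m/s and, assuming c ≈ 343 m/s, M = 40/343 ≈ 0.12), the low-Mach-number condition holds; a stationary receiver would nevertheless perceive a 2 kHz tone anywhere between roughly 2000/(1 + M) ≈ 1.79 kHz and 2000/(1 - M) ≈ 2.26 kHz during the pass-by, which is why the recordings must be de-Dopplerized before beamforming. The sketch below illustrates the interpolation and re-sampling step for one microphone channel under the assumptions of a straight-line trajectory at constant speed; the function name, the linear interpolation and the neglect of the amplitude correction are simplifications made for illustration, not details taken from the paper.

```python
import numpy as np

def dedopplerize(p_rec, fs, mic_pos, src_pos0, velocity, c=343.0):
    """Remove the Doppler effect from one microphone recording.

    p_rec    : recorded samples (uniformly sampled in reception time)
    fs       : sampling rate in Hz
    mic_pos  : (3,) microphone position in m
    src_pos0 : (3,) source position at t = 0 in m
    velocity : (3,) constant source velocity in m/s
    """
    n = np.arange(len(p_rec))
    t = n / fs                                    # uniform time grid
    src_pos = src_pos0[None, :] + t[:, None] * velocity[None, :]
    r = np.linalg.norm(src_pos - mic_pos[None, :], axis=1)
    t_r = t + r / c                               # reception time for each emission time
    # p_rec[k] was recorded at reception time t[k]; the de-Dopplerized sample
    # for emission time t[k] is the recording evaluated at t_r(t[k]).
    # (Amplitude correction by the convection factor is omitted here.)
    return np.interp(t_r, t, p_rec)
```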
3 Microphone array design
3.1 Spiral array
Compared with regular arrays, spiral arrays have the advantages of a lower maximum sidelobe level (MSL) and the avoidance of grating lobes [10]. In this research, an Archimedean spiral array is applied. The basic parameters are given in Table 1, and Figure 1(a) shows its layout.
Table 1 - Basic parameters of the spiral microphone array

Microphone number: 32
Spacing: 0.04-0.06 m
Diameter: 0.50 m
Resolution: 0.64 m
Frequency: 3 kHz
Steering angle: 30°
Figure 1 - Spiral (a) and separable (b) array layouts
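A layout of this kind can be sketched as below. The number of turns and the start radius are illustrative assumptions chosen so that the aperture is about 0.5 m and the spacing between neighbouring microphones is on the order of 0.05 m; the actual spiral used by the authors may differ.

```python
import numpy as np

def archimedean_spiral_array(n_mics=32, r_start=0.04, r_end=0.25, turns=1.8):
    """Place n_mics microphones with equal arc-length spacing on an
    Archimedean spiral.  The default radii and number of turns are
    illustrative: they give an aperture of roughly 0.5 m and neighbour
    spacings of about 0.05 m, similar to the values quoted in Table 1."""
    # dense parameterization of the spiral r(phi) = r_start + k * phi
    phi = np.linspace(0.0, 2.0 * np.pi * turns, 4000)
    r = r_start + (r_end - r_start) * phi / phi[-1]
    x, y = r * np.cos(phi), r * np.sin(phi)
    # cumulative arc length along the dense curve
    s = np.concatenate([[0.0], np.cumsum(np.hypot(np.diff(x), np.diff(y)))])
    # pick n_mics points at equally spaced arc lengths
    s_target = np.linspace(0.0, s[-1], n_mics)
    return np.column_stack([np.interp(s_target, s, x), np.interp(s_target, s, y)])

mic_xy = archimedean_spiral_array()
spacing = np.linalg.norm(np.diff(mic_xy, axis=0), axis=1)
print(mic_xy.shape, spacing.min(), spacing.max())   # (32, 2), roughly 0.05 m spacing
```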
3.2 Separable array
To accelerate the beamforming and deconvolution process using KAT, a separable array geometry is necessary. To achieve comparable results, the resolutions of the spiral and separable arrays should be similar, and the number of microphones remains the same. Therefore, the aperture of the separable array is set to 0.3 m, namely about half of the spiral array's size. A non-redundant array [11] is able to keep a higher resolution with a small number of microphones compared to a longer uniform array. In this research, 6 microphones are aligned linearly, with spacings of 0.02 m, 0.07 m, 0.12 m, 0.26 m and 0.30 m measured from the first microphone, as illustrated in Figure 1(b).
Extending the linear non-redundant array to two dimensions yields a 6 x 6 array. After eliminating the microphones in the four corners, the number of microphones remains identical to that of the spiral array. Figure 2 compares the beam patterns of the three arrays at 3 kHz. First, removing the corner microphones leaves the beam pattern almost unchanged. Second, the spiral and reduced separable arrays share a similar beam width, and hence a similar resolution. The sidelobe levels of the separable arrays are, however, almost 10 dB higher than that of the spiral array; these sidelobes become irrelevant once an appropriate deconvolution method is applied.
Figure 2 The beam pattern of the three arrays
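The sketch below shows how the separable geometry of Figure 1(b) can be constructed and how a single-frequency beam-pattern cut such as those in Figure 2 can be evaluated. It assumes that the five values quoted in Section 3.2 are positions measured from the first microphone (which yields the stated 0.3 m aperture) and that the pattern is evaluated for far-field plane waves with the array steered to broadside; both are assumptions made for illustration, not the processing used by the authors.

```python
import numpy as np

c = 343.0                     # speed of sound in m/s (assumed)
f = 3000.0                    # analysis frequency in Hz
k = 2.0 * np.pi * f / c       # wavenumber

# Non-redundant 1-D layout, interpreted here as positions measured from the
# first microphone (an assumption), which gives the 0.3 m aperture.
x_line = np.array([0.0, 0.02, 0.07, 0.12, 0.26, 0.30])

# Separable 6 x 6 grid from the 1-D layout, then drop the four corner
# microphones so that 32 elements remain, as for the spiral array.
xx, yy = np.meshgrid(x_line, x_line)
corner = np.isin(xx, [x_line[0], x_line[-1]]) & np.isin(yy, [x_line[0], x_line[-1]])
pos = np.column_stack([xx[~corner], yy[~corner]])
assert pos.shape[0] == 32

def beam_pattern_cut(pos, k, n_angles=361):
    """Far-field beam pattern in the plane containing the x-axis, array
    steered to broadside (normalized, in dB)."""
    theta = np.linspace(-np.pi / 2.0, np.pi / 2.0, n_angles)
    phase = k * np.outer(np.sin(theta), pos[:, 0])          # plane-wave phases
    b = np.abs(np.exp(1j * phase).sum(axis=1)) / pos.shape[0]
    return theta, 20.0 * np.log10(np.maximum(b, 1e-12))

theta, bp_db = beam_pattern_cut(pos, k)
print(round(bp_db.max(), 1))   # 0.0 dB at broadside
```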
4 Beamforming and output spectrum
4.1 Delay-and-sum beamforming
Beamforming is a general way to localize sound sources based on temporal and spatial filtering with a microphone array [12]. The most conventional beamformer is the delay-and-sum (DAS) beamformer, which reinforces the signal by delaying the signal at each microphone and adding all the delayed signals. The output of DAS is

$$ b(t) = \sum_{m=1}^{M} w_m\, p_m\!\left(t - \Delta_m\right), \qquad \Delta_m = \frac{r_0 - r_m}{c} \qquad (3) $$

where M is the number of microphones (the subscript m refers to the m-th microphone), w_m is the weight applied to the m-th signal, p_m(t) is the de-Dopplerized signal at microphone m, \Delta_m is the time delay that aligns this signal with the focus point, r_m is the distance between the source (focus) point and microphone m, and r_0 is the distance between the source point and the array origin.
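A time-domain sketch of Eq. (3) for one focus point is given below; the linear (fractional-delay) interpolation and the uniform weights are implementation choices made for illustration, not details taken from the paper.

```python
import numpy as np

def das_output(p_dedopp, fs, mic_pos, focus, c=343.0, weights=None):
    """Delay-and-sum output b(t) of Eq. (3) for one steering (focus) point.

    p_dedopp : (M, N) array of de-Dopplerized signals, one row per microphone
    mic_pos  : (M, 3) microphone positions in m (array origin at [0, 0, 0])
    focus    : (3,) steering point in m (a grid point on the source plane)
    """
    M, N = p_dedopp.shape
    w = np.full(M, 1.0 / M) if weights is None else np.asarray(weights)
    r0 = np.linalg.norm(focus)                    # focus -> array origin
    t = np.arange(N) / fs
    out = np.zeros(N)
    for m in range(M):
        r_m = np.linalg.norm(focus - mic_pos[m])  # focus -> microphone m
        delta = (r0 - r_m) / c                    # time delay of Eq. (3)
        out += w[m] * np.interp(t - delta, t, p_dedopp[m])   # p_m(t - delta)
    return out

# Scanning the source plane: evaluating das_output for every grid point and
# mapping e.g. the output power yields color maps like those in Figure 4.
```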
4.2 DAS output spectra
A plane (1.5 m x 5 m) moves in the x-direction at a speed of 40 m/s, carrying two point sources placed 2 m apart. A microphone array is located 1.5 m away from the moving plane, with the array origin on the z-axis (Figure 3). The plane is meshed into a grid with 5 cm spacing; each grid point represents a potential sound source, so the array can steer its focus to "scan" the plane and search for the sources. The left source consists of a 2 kHz tone plus noise, and the right source contains the same noise as the left one. In both cases, additive white Gaussian noise (AWGN) with SNR = 20 dB is added. The RMS sound pressures of the tone and the noise are both 1 Pa.
Figure 3 - Moving sound source measured by microphone array
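For reference, the sketch below generates the two (source-frame) signals used in this simulation. The sampling rate, signal duration and noise realizations are assumptions made for the example, and the source motion and the propagation to the microphones are not modelled here.

```python
import numpy as np

fs = 44100                              # sampling rate in Hz (assumed)
T = 0.5                                 # duration in s (assumed)
t = np.arange(int(T * fs)) / fs
rng = np.random.default_rng(0)

# Left source: 2 kHz tone plus broadband noise, each with an RMS of 1 Pa.
tone = np.sqrt(2.0) * np.sin(2.0 * np.pi * 2000.0 * t)        # RMS = 1 Pa
noise = rng.standard_normal(t.size)
noise /= np.sqrt(np.mean(noise**2))                           # RMS = 1 Pa
src_left = tone + noise
# Right source: the same noise only.
src_right = noise.copy()

def add_awgn(x, snr_db, seed=1):
    """Add white Gaussian noise at the given signal-to-noise ratio in dB."""
    noise_power = np.mean(x**2) / 10.0**(snr_db / 10.0)
    g = np.random.default_rng(seed)
    return x + np.sqrt(noise_power) * g.standard_normal(x.size)

src_left = add_awgn(src_left, 20.0)
src_right = add_awgn(src_right, 20.0, seed=2)
```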
The localization abilities of the deconvolution method using the spiral and the separable array have been confirmed to be comparable [13]. Therefore, this paper only shows the color maps generated by DAS. The localization results of the two arrays at 2 kHz are shown in Figure 4; the positions of the two sources are (1.5, 0.75) and (3.5, 0.75).
As previously examined, the spiral array has lower sidelobe levels and the resolutions of the two arrays are similar. The separable array has comparatively higher sidelobe levels, but DAMAS is capable of removing strong sidelobes. In this sense, the separable array combined with KAT can be used for beamforming. In addition, neither array resolves the sources well at lower frequencies (e.g., 1.6 kHz), and for some frequency bands the localization deviation reaches 5-10 cm.
In terms of localization capability, the separable array can therefore replace the spiral array, with DAMAS reducing the sidelobe levels and KAT accelerating the whole procedure.
Figure 4 - Localization results of the spiral array and the separable array at 2 kHz (markers indicate the true source positions and the maxima of the color maps)
5 Sound source synthesis
5.1 SMS analysis and parameterization
The beamforming output spectrum representing a source is obtained by steering the array to the detected source position. A Short-Time Fourier Transform (STFT) is applied to the resulting signals. The prominent peaks are detected in each magnitude spectrum, and peak continuation is then tracked along all frames in the time domain. The deterministic component is synthesized as the sum of all detected tones over all trajectories and frames. Afterwards, the broadband component is modeled by subtracting the synthesized tonal component from the original spectrum.
In SMS, several parameters are involved, such as the window size and type, the maximum peak amplitude (MPA), the maximum guide number (MGN), the maximum sleeping time (MST) and the maximum peak deviation (MPD) used during continuation detection [7]. Since the frequency resolution is limited when the window size is small, the spectra are interpolated to a resolution of 1 Hz before SMS is applied.
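A minimal sketch of this analysis step is given below, using SciPy's STFT and a nearest-frequency continuation rule. The window, hop size, MPA and MPD mirror the values in Table 2, but the Kaiser shape parameter and the amplitude scaling are rough assumptions, and the guide bookkeeping (MGN, MST) of the full SMS algorithm [7] is simplified away.

```python
import numpy as np
from scipy.signal import stft, find_peaks

def detect_tracks(x, fs, win=512, hop=16, mpa=0.025, mpd=150.0):
    """Detect spectral peaks per frame and link them into trajectories.

    mpa : amplitude threshold in Pa (peaks below it are ignored)
    mpd : maximum frequency deviation in Hz allowed when continuing a track
    Returns a list of trajectories, each a list of (frame, freq_Hz, amplitude).
    """
    f, t_frames, Z = stft(x, fs=fs, window=('kaiser', 8.0),
                          nperseg=win, noverlap=win - hop)
    mag = 2.0 * np.abs(Z)                  # rough single-sided amplitude scale
    tracks = []
    for l in range(mag.shape[1]):
        peak_idx, _ = find_peaks(mag[:, l], height=mpa)
        for i in peak_idx:
            freq, amp = f[i], mag[i, l]
            # continue the closest track that ended in the previous frame
            cand = [tr for tr in tracks
                    if tr[-1][0] == l - 1 and abs(tr[-1][1] - freq) <= mpd]
            if cand:
                best = min(cand, key=lambda tr: abs(tr[-1][1] - freq))
                best.append((l, freq, amp))
            else:
                tracks.append([(l, freq, amp)])
    return tracks
```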
Figure 5 shows the beamforming output spectra of the two sources obtained with the spiral and separable arrays. The peak in each spectrum is distinct due to the removal of the Doppler effect, and the amplitudes of the peaks are almost identical. Thus, for this synthesis step, the separable array is again comparable to the spiral one.
Figure 5 - Beamforming output spectra with the array steered to the left source position (left: spiral array, right: separable array)
Taking MPA as a variable, the other parameters are given in Table 2. In this paper, the MGN is
determined dynamically according to the peaks in the first frame.
Table 2 - Parameters of SMS

Window size: 512
Hop size: 16
Window type: Kaiser
MST: 3 frames
MPD: 150 Hz
The results with varying MPA are shown in Figure 6. When MPA = 0.025 Pa, a clear 2 kHz tone is tracked along all frames, while in the other cases incorrect trajectories are found. Since the source information is given in this example, the result can be verified directly. If the source is unknown, no a priori knowledge is available; in that case the parameters need to be determined carefully with proper simulation and verification.
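With the detect_tracks sketch from the previous subsection, such a parameter study can be scripted as a simple sweep; b and fs below are placeholders for the beamforming output signal and its sampling rate, and the threshold values are illustrative.

```python
# Hypothetical sweep over the MPA threshold using the detect_tracks sketch above.
for mpa in (0.01, 0.025, 0.05, 0.1):                     # thresholds in Pa (assumed)
    tracks = detect_tracks(b, fs, win=512, hop=16, mpa=mpa, mpd=150.0)
    long_tracks = [tr for tr in tracks if len(tr) > 3]    # survive more than 3 frames
    print(f"MPA = {mpa} Pa: {len(tracks)} trajectories, {len(long_tracks)} persistent")
```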
Figure 6 Peak detection results with varying MPA
5.2 SMS Synthesis
After the determination of the peaks in each trajectory, the frequency, amplitude and phase (f_{l,i}, A_{l,i} and φ_{l,i}, respectively) are obtained, where l = 1, ..., L and i = 1, ..., I are the frame and trajectory indices. The synthesized tonal signal in frame l is

$$ d_l(n) = \sum_{i=1}^{I} A_{l,i} \cos\!\left(2\pi f_{l,i}\,\frac{n}{f_s} + \varphi_{l,i}\right) \qquad (4) $$

where n = 0, ..., N-1 is the sample index within the frame and f_s is the sampling rate. Adding the synthesized signals over all frames and trajectories gives the final representation of the sinusoidal (deterministic) signal.
The residual signal is then obtained by subtracting the deterministic component from the original beamforming spectrum. Since noise is included during the recording, the level of the residual broadband component can be reduced accordingly to approach the original source signal. Adjusting the noise in this way is another benefit of parameterization for dynamic auralization, as it compensates for the noise uncertainty in the measurements.
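A sketch of this synthesis and residual step is given below. Each trajectory contributes one sinusoid per frame, written into that frame's hop-sized segment; using the global sample index keeps the phase continuous for a stationary tone, while the phase interpolation and spectral-envelope modelling of the full SMS system [7] are omitted. The zero initial phase and the residual scaling factor in the usage comment are assumptions for illustration.

```python
import numpy as np

def synthesize_deterministic(tracks, n_frames, hop, fs):
    """Tonal (deterministic) component from the peak trajectories, cf. Eq. (4)."""
    det = np.zeros(n_frames * hop)
    for tr in tracks:
        for l, freq, amp in tr:
            n = l * hop + np.arange(hop)        # global sample indices of frame l
            det[l * hop:(l + 1) * hop] += amp * np.cos(2.0 * np.pi * freq * n / fs)
    return det

def stochastic_residual(x, det):
    """Broadband (stochastic) component: original minus the tonal part."""
    m = min(len(x), len(det))
    return x[:m] - det[:m]

# Example (names from the earlier sketches): the residual level can then be
# scaled down to compensate for the noise added during the measurement.
# det = synthesize_deterministic(tracks, n_frames=max(tr[-1][0] for tr in tracks) + 1,
#                                hop=16, fs=fs)
# res = 0.8 * stochastic_residual(b, det)      # illustrative scaling factor
```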
6 Conclusions and outlook
This paper describes a synthesizer for moving sound sources based on microphone arrays and parameterization for auralization. A de-Dopplerization technique in the time domain is introduced, and DAS uses the de-Dopplerized signals as inputs. The spiral and separable arrays with similar resolutions are compared for localization and signal reconstruction. Their abilities are not significantly different if a deconvolution method is applied to reduce the sidelobe levels. In this regard, separable arrays can be applied instead of irregular arrays: on the one hand, a separable array allows KAT to accelerate the beamforming and deconvolution procedure; on the other hand, its rectangular geometry is easier to set up. DAS is then extended to source signal reconstruction. SMS parameterizes the DAS output spectrum, and the signals can then be generated separately as deterministic and broadband components.
This research introduces the possibility of combining beamforming and spectral modeling to synthesize moving sound sources. The results suggest that such a synthesizer is valid for the specific case discussed. The parameterization procedure needs to be optimized in further investigations with more simulations. Additionally, the deconvolved beamforming output still needs to be included in this synthesizer as the source spectrum. Since the sound field produced by real moving sources is more complicated,
on-site measurements are necessary to verify the simulation results presented in this work. In addition, since the target sound sources are to be auralized, it is essential to deepen the understanding through psychoacoustic analysis and listening tests with the synthesized signals.
Acknowledgements
The authors acknowledge Bruno Masiero from the University of Campinas for the cooperation on the design of the separable array.
References
[1] Vorlaender, M. Auralization: Fundamentals of Acoustics, Modelling, Simulation, Algorithms and
Acoustic Virtual Reality, Berlin, Heidelberg: Springer Berlin Heidelberg, 2007.
[2] Meng, F.; Wefers, F.; Vorlaender, M. Moving Sound Source Simulation Using Beamforming and Spectral Modelling for Auralization. DAGA 2016, 42. Jahrestagung für Akustik, Aachen, 14-17 March 2016, CD-ROM.
[3] Van Trees, H. L. Detection, Estimation, and Modulation Theory, Optimum Array Processing. John Wiley & Sons, New York, USA, 2002.
[4] Brooks, T. F.; Humphreys, W. M. A deconvolution approach for the mapping of acoustic sources (DAMAS) determined from phased microphone arrays. Journal of Sound and Vibration, 294 (4), 2006, pp 856-879.
[5] Ribeiro, F. P.; Nascimento, V. H. Fast Transforms for Acoustic Imaging - Part I: Theory. IEEE Transactions on Image Processing, 20 (8), 2011, pp 2229-2240.
[6] Coelho, R.F., Nascimento, V.H., de Queiroz, R.L., Romano, J.M.T. and Cavalcante, C.C. Signals
and Images: Advances and Results in Speech, Estimation, Compression, Recognition, Filtering,
and Processing. CRC Press, 2015.
[7] Serra, X.; Smith, J. Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition. Computer Music Journal, 1990, pp 12-24.
[8] Miner, N. E. A Wavelet Approach to Synthesizing Perceptually Convincing Sounds for Virtual
Environments and Multi-Media, PhD dissertation. University of New Mexico, 1998.
[9] Morse, P. M.; Ingard, K. Uno. Theoretical acoustics. Princeton: Princeton University Press, 1968.
[10] Christensen, J.J; Hald, J. Technical Review: Beamforming. Bruel & Kjaer, 2004.
[11] Vertatschitsch, E.; Haykin, S. Nonredundant arrays. Proceedings of the IEEE, 74 (1), 1986, pp 217-218.
[12] Johnson, D. H.; Dudgeon, D. E. Array signal processing: concepts and techniques. Simon &
Schuster, 1992.
[13] Ribeiro, F. P.; Nascimento, V. H. Fast Transforms for Acoustic Imaging - Part II: Applications. IEEE Transactions on Image Processing, 20 (8), 2011, pp 2241-2247.