Fig 3 - uploaded by Xin Chang
The GCC algorithm provides good accuracy with more than 2 microphones; however, increasing their number might also increase the error.

Source publication
Conference Paper
Sound source localization in real time can be employed in numerous applications such as filtering, beamforming, security system integration, etc. Algorithms employed in this field require not only fast processing speed but also enough accuracy to properly cope with the application requirements. This work presents accuracy benchmarks of a hybrid app...

Context in source publication

Context 1
... from the already stated when considering Figure 2, it can be perceived that the point coordinates (stored according to their shown tag number) to be accessed are not in any specific order; thus, the associated delays might also be accounted for in this hybrid approach. Nonetheless, the cost associated with the previous tasks is low; therefore, the processing optimization of the hybrid approach is still considerable. Another way to address the latter aspect is to store point coordinates in column order rather than row order. The following subsection presents our accuracy estimations of the proposed algorithm.

The hybrid approach has two sources of error: the one introduced by the GCC algorithm when obtaining the angle of arrival, and the one inherent to the DSB-SRP itself. Because both algorithms can yield errors at the same time, we present the results considering them both. For all the following simulations, a mean value is presented as the final result, since we varied the target position for a better generalization of the results. We ran a first study related to GCC precision using the aforementioned toolbox [13]; after considering different scenarios for linear arrays, we obtained an average error of 2° to 4° for arrays at least 4 cm long with more than 3 microphones. Arrays that did not meet these conditions yielded errors of up to 16°. These results are shown in Figure 3; when the number of microphones increases, so does the error. Since every pair of microphones provides an estimated angle, the error increase may be caused by the averaging of all measured values.

The hybrid algorithm was analysed in terms of the GCC error; a few parameters can be changed to tune its output but, according to our results, only some of them are relevant. Initially, we varied parameter ε and measured the distance of the detected point to the real source location.
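The GCC angle-of-arrival step discussed above, with one estimate per microphone pair averaged into a single angle, can be sketched as follows. This is only an illustrative sketch and not the toolbox [13] used by the authors: it assumes a far-field source, a linear array with positions given in ascending order (in metres), a speed of sound of 343 m/s, and angles measured from broadside. The function names `gcc_phat_delay` and `estimate_angle_deg` are hypothetical.

```python
from itertools import combinations

import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def gcc_phat_delay(sig, ref, fs, max_tau):
    """Delay of `sig` relative to `ref` in seconds (positive: `sig` lags)."""
    n = len(sig) + len(ref)  # zero-pad so the correlation is linear
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = max(1, min(int(fs * max_tau), n // 2))
    # Reorder so that index 0 is lag -max_shift and the last is +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (int(np.argmax(cc)) - max_shift) / fs


def estimate_angle_deg(signals, positions, fs):
    """Mean angle of arrival (degrees from broadside) over all mic pairs.

    Convention assumed here: a positive angle means the wavefront reaches
    the lower-position microphone of each pair first.
    """
    angles = []
    for i, j in combinations(range(len(positions)), 2):
        d = positions[j] - positions[i]  # pair spacing in metres
        tau = gcc_phat_delay(signals[j], signals[i], fs,
                             max_tau=d / SPEED_OF_SOUND)
        sin_theta = np.clip(SPEED_OF_SOUND * tau / d, -1.0, 1.0)
        angles.append(np.degrees(np.arcsin(sin_theta)))
    return float(np.mean(angles))
```

Because the delay is quantized to whole samples, short arrays map one sample of delay error to a large angular error, which is consistent with the larger errors reported for very short arrays.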
The test was performed with a configuration of 8 microphones, 50 cm apart, and assuming a GCC error range of [−30°, 30°]; the result is shown in Figure 4. The horizontal red line represents the output when using the DSB-SRP, while the black lines show the behavior of the GCC-DSB for different epsilon values; when the GCC angle error is greater than the value of epsilon, accuracy is rapidly lost. On the other hand, if the GCC error lies within the range covered by epsilon, the accuracy is the same as with the DSB-SRP.

Through the simulations performed, we noticed that one of the most important parameters affecting localization accuracy is the microphone spacing. For instance, consider Figure 5, where the same scenario is presented for three different spacings: in Figure 5a the microphones are only 10 cm apart and a large red fringe of points is detected as possible source locations. In contrast, Figures 5b and 5c show that when the separation is increased, the energy plot is much clearer, thus enabling better accuracy. Since more than one point can be detected as the maximum, we decided to select the mean point between all of them as the source position.

In Figure 6 the algorithm is tested with 12 microphones and a 10° epsilon, changing their spacing. In Figure 6a we can see the number of points that are detected for three different cases; although the number of maxima between 20 and 50 cm does not change much, the array size changes considerably. In accordance with this result, yet with smaller differences, Figure 6b shows that the accuracy improves for arrays with wider spacing. Once more, the validity of the result holds as long as the GCC error is less than the value of epsilon.
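The role of epsilon and the mean-of-maxima rule described above can be illustrated with a short sketch. It assumes that the hybrid approach restricts the DSB-SRP search to grid points whose bearing lies within ±epsilon of the GCC angle; the excerpt does not state the exact rule, so this is our reading rather than the authors' implementation. The function name `locate_in_sector`, the precomputed `energy` values, and the bearing convention (degrees from broadside, with the array along the x axis) are all hypothetical.

```python
import numpy as np


def locate_in_sector(points, energy, center, gcc_angle_deg, epsilon_deg):
    """Mean of the maximum-energy grid points inside the angular sector.

    points : (N, 2) array of candidate (x, y) positions
    energy : (N,) steered response power at each candidate (assumed given)
    center : (x, y) of the array centre
    """
    points = np.asarray(points, dtype=float)
    energy = np.asarray(energy, dtype=float)
    offsets = points - np.asarray(center, dtype=float)
    # Bearing from broadside (the +y direction), positive toward +x.
    bearing = np.degrees(np.arctan2(offsets[:, 0], offsets[:, 1]))
    in_sector = np.abs(bearing - gcc_angle_deg) <= epsilon_deg
    if not in_sector.any():
        raise ValueError("no grid point falls inside the angular sector")
    sector_energy = energy[in_sector]
    # Several points may tie for the maximum; average their coordinates.
    is_max = np.isclose(sector_energy, sector_energy.max())
    return points[in_sector][is_max].mean(axis=0)
```

Under this assumption, a GCC error larger than epsilon leaves the true source outside the sector, so the global maximum can no longer be selected, which matches the rapid loss of accuracy described for Figure 4.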
A final study was carried out to understand the behavior of the algorithms in relatively small arrays; experimental results using the aforementioned database showed that for arrays 10 cm long or less, even with no GCC angle error, it is not possible to obtain good precision. According to our estimations, sufficiently accurate results can be obtained with arrays at least 20 cm long; although, as shown in Figure 5, a considerable number of maxima will be detected, on average, good estimations can still be obtained with this configuration. Our results show that the precision obtained tends to improve slightly with more microphones but is practically the same in all cases. Through the analysis performed in this work, we could verify that the proposed hybrid algorithm can perform as accurately as the traditional DSB-SRP with linear microphone arrays. An important factor that can drastically change the output of the algorithm is the error induced by the GCC-PHAT algorithm; when the angle error is greater than the value of parameter epsilon, the algorithm ...

Citations

Article
Humans make extensive use of auditory cues to interact with other humans, especially in challenging real-world acoustic environments. Multiple distinct acoustic events usually mix together in a complex auditory scene. The ability to separate and localize mixed sound in complex auditory scenes remains a demanding skill for binaural robots. In fact, binaural robots are required to disambiguate and interpret the environmental scene with only two sensors. At the same time, robots that interact with humans should be able to gain insights about the speakers in the environment, such as how many speakers are present and where they are located. For this reason, the speech signal is distinctly important among auditory stimuli commonly found in human-centered acoustic environments. In this paper, we propose a Bayesian method of selectively processing acoustic data that exploits the characteristic amplitude envelope dynamics of human speech to infer the location of speakers in the complex auditory scene. The goal was to demonstrate the effectiveness of this speech-specific temporal dynamics approach. Further, we measure how effective this method is in comparison with more traditional methods based on amplitude detection only.