Boaz Rafaely's research while affiliated with Ben-Gurion University of the Negev and other places

Publications (221)

Article
Full-text available
Spatial audio has been studied for several decades, but has seen much renewed interest recently due to advances in both software and hardware for capture and playback, and the emergence of applications such as virtual reality and augmented reality. This renewed interest has led to the investment of increasing efforts in developing signal processing...
Article
Full-text available
In recent years, spatial audio reproduction has been widely researched with many studies focusing on headphone-based spatial reproduction. A popular format for spatial audio is higher order Ambisonics (HOA), where a spherical microphone array is typically used to obtain the HOA signals. When a spherical array is not available, beamforming-based bin...
Article
Full-text available
In many applications, such as hearing aids and virtual reality, spatial audio is used to provide a more natural experience to the users. However, when captured in the real world, the audio signals may suffer from noise and interference. In this case, the challenge is to attenuate the undesired signals, while preserving the desired signals with thei...
Article
Full-text available
Acoustic rake filters perform coherent summation of the early room reflections using beamforming, with the aim of improving beamforming performance. This concept has been investigated for speech enhancement applications, improving noise reduction and late reverberation attenuation. Current studies typically assume that the parameters of the early r...
Chapter
Full-text available
Binaural reproduction of high-quality spatial sound has gained considerable interest with the recent technology developments in virtual and augmented reality. The reproduction of binaural signals in the Spherical-Harmonics (SH) domain using Ambisonics is now a well-established methodology, with flexible binaural processing realized using SH represe...
Article
Full-text available
The perception of sound in real-life acoustic environments, such as enclosed rooms or open spaces with reflective objects, is affected by reverberation. Hence, reverberation is extensively studied in the context of auditory perception, with many studies highlighting the importance of the direct sound for perception. Based on this insight, speech pr...
Article
Full-text available
Blind estimation of the direction of arrival (DOA) and delay of room reflections from reverberant sound may be useful for a wide range of applications. However, due to the high temporal and spatial density of early room reflections and their low power compared to the direct sound, existing methods can only detect a small number of reflections. This...
Article
Full-text available
Reproduction of high quality spatial sound has gained considerable importance with the recent technology developments in the fields of virtual and augmented reality. Recently, the reproduction of binaural signals in the Spherical-Harmonics (SH) domain has been proposed. This is performed by using SH representations of the sound-field and the Head-R...
Article
Blind estimation of the direction of arrival (DOA) and delay of room reflections from reverberant sound may be useful for a wide range of applications. However, the high temporal and spatial density of early room reflections limits existing methods to the detection of only a small number of reflections. This paper presents...
Article
Full-text available
The coherent signal subspace method (CSSM) enables the direction-of-arrival (DoA) estimation of coherent sources with subspace localization methods. The focusing process that aligns the signal subspaces within a frequency band to its central frequency is central to the CSSM. Within current focusing approaches, a direction-independent focusing appro...
Article
Full-text available
Speech enhancement in a single channel has been well studied in the literature in applications such as speech communication systems. However, in emerging applications such as virtual reality and spatial audio, in addition to attenuating undesired signals, the ability to preserve the spatial information of the desired signal captured in a noisy envi...
Article
Spatial release from masking (SRM) in the context of speech perception denotes the improvement of speech intelligibility in noise when the speech and the noise sources are spatially separated. A similar effect has been reported with binaural sound reproduction (BSR), in which speech and noise are simulated and reproduced using headphones....
Article
Full-text available
Sound calibration is employed in many commercial audio systems for improving sound quality. This process includes the estimation of the room transfer function (RTF) between each loudspeaker and a microphone located at the listeners' position. Current methods for RTF estimation employ calibration signals, such as noise or tones, in a dedicated proce...
Conference Paper
A recent approach to improving the robustness of sound localization in reverberant environments is based on pre-selection of time-frequency pixels that are dominated by direct sound. This approach is equivalent to applying a binary time-frequency mask prior to the localization stage. Although the binary mask approach was shown to be effective, it...
Article
Full-text available
With the proliferation of high quality virtual reality systems, the demand for high fidelity spatial audio reproduction has grown. This requires individual head-related transfer functions (HRTFs) with high spatial resolution. Acquiring such HRTFs is not always possible, which motivates the need for sparsely sampled HRTFs. Additionally, real-time ap...
Article
The importance of the information in the direct sound to human perception of spatial sound sources is an ongoing research topic. The classification between direct sound and diffuse or reverberant sound forms the basis of numerous studies in the field of spatial audio. In particular, parametric spatial audio representation methods use this classific...
Article
Full-text available
In response to renewed interest in virtual and augmented reality, the need for high-quality spatial audio systems has emerged. The reproduction of immersive and realistic virtual sound requires high resolution individualized head-related transfer function (HRTF) sets. In order to acquire an individualized HRTF, a large number of spatial me...
Chapter
Motivated by the problem of spatial sampling of a sound field by a spherical array, Chap. 3 presented methods for sampling functions on a sphere, followed by methods for reconstructing a function from its samples. These could form the basis for computing the sound pressure on the surface of a sphere, given measurements by an array of microphones. H...
Chapter
The mathematical background for functions defined on the unit sphere was presented in Chap. 1. Spherical harmonics played an important role in presenting and manipulating these functions. In this chapter, functions on the sphere are defined through the formulations of fields in three dimensions. Although sound fields are of primary concern in this...
Chapter
This chapter provides the mathematical background necessary for studying spherical array processing. Spherical arrays typically sample functions on a sphere (e.g. sound pressure); therefore, this chapter begins by presenting the spherical coordinate system, as well as some examples of functions on the sphere. Spherical harmonics are a central theme...
Chapter
Spherical microphone arrays are realized by placing microphones in three-dimensional space and recording the signals at the microphone locations. When the microphones are placed on the surface of a sphere, they sample the sound pressure at the sphere surface. Estimation of the sound pressure function on the measurement sphere may depend on the samp...
Chapter
Beamforming with spherical microphone arrays was presented in Chap. 5 as an instrument to achieve directional filtering, characterized by the beam pattern of the array. It may be desired to control the beam pattern in a more explicit manner to achieve specific properties. For example, beamformers that achieve maximum directivity index may be useful...
Chapter
Optimal beamformer design, as presented in Chap. 6, may be very useful, but does not take into account the properties of the specific sound field producing the signals at the microphones. In this chapter, beamforming in which the beam pattern is tailored to the actual sound field is presented. This beamforming distinguishes between the desired sign...
Chapter
Chapter 4 presented various ways to configure a spherical microphone array and discussed the advantages of each configuration. Once microphones are positioned in space in a desired configuration, e.g. on the surface of a rigid sphere, they can be connected to conditioning equipment, and the signal at each microphone can be recorded. In this chapter...
Preprint
This paper summarizes the methods used to localize the sources recorded for the LOCalization And TrAcking (LOCATA) challenge. The tasks of stationary sources and arrays were considered, i.e., tasks 1 and 2 of the challenge, which were recorded with the Nao robot array, and the Eigenmike array. For both arrays, direction of arrival (DOA) estimation...
Article
Direction of arrival (DOA) estimation for speech sources is an important task in audio signal processing. This task becomes a challenge in reverberant environments, which are typical of real scenarios. Several methods of DOA estimation for speech sources have been developed recently, in an attempt to overcome the effect of reverberation. One effect...
Chapter
This paper introduces a framework for robust speaker localization in reverberant environments based on a causal analysis of the temporal relationship between direct sound and corresponding reflections. It extends previously proposed localization approaches for spherical microphone arrays based on a direct-path dominance test. So far, these methods...
Conference Paper
Full-text available
High-fidelity 3D audio experience requires accurate individual head-related transfer function (HRTF) representation. However, the process of measuring individual HRTFs typically involves measurements from hundreds of directions, with specialized and expensive equipment, which makes this process inaccessible for most users. In this paper, a new tech...
Presentation
With the increased popularity of virtual reality applications, the need for high fidelity spatial audio has emerged. Reproduction of high quality spatial audio requires high resolution individualized head-related transfer functions (HRTFs). However, these are typically unavailable as they demand a large number of measurements and specialized equipm...
Presentation
High-quality spatial sound reproduction is important for many applications of virtual and augmented reality. A key component in spatial audio reproduction is the head-related transfer function (HRTF). To achieve a realistic spatial audio experience, in terms of sound localization and externalization, high resolution individualized HRTFs are necessa...
Article
With the recent proliferation of spherical microphone arrays for sound field recording, methods have been developed for rendering binaural signals from these recordings and free-field head related transfer functions (HRTFs). Employing spherical arrays naturally leads to methods that are formulated in the spherical harmonics (SH) domain, using order...
Article
Spatial analysis of room acoustics is an ongoing research topic. Microphone arrays have been employed for spatial analyses with an important objective being the estimation of the direction-of-arrival (DOA) of direct sound and early room reflections using room impulse responses (RIRs). An optimal method for DOA estimation is the multiple signal clas...
Chapter
This chapter presents an overview of the conventional plane-wave decomposition (PWD) method for spherical microphone arrays, as well as some recent advances that aim to provide improved PWD estimations. It starts with the conventional PWD method and describes the sources of noise amplification and spatial aliasing errors. The PWD methods differ in...
Article
Binaural sound reproduction (BSR) can improve speech intelligibility due to spatial release from masking (SRM), in the case where the signals are reproduced in such a manner that they are perceived as arriving at the listener from different directions. However, the effect of BSR on speech intelligibility has not yet been thoroughly studied for diff...
Article
Estimation of the direction-of-arrival (DoA) of a speaker in a room is important in many audio signal processing applications. Environments with reverberation that masks the DoA information are particularly challenging. Recently, a DoA estimation method that is robust to reverberation has been developed. This method identifies time-frequency bins d...
Article
Full-text available
The synthesis of binaural signals from spherical microphone array recordings has been recently proposed. The limited spatial resolution of the reproduced signal due to order-limited reproduction has been previously investigated perceptually, showing spatial perception ramifications, such as poor source localization and limited externalization. Furth...
Presentation
Growing interest in virtual reality has led to greater demand for immersive virtual audio systems. High fidelity spatial audio requires individualized head related transfer functions (HRTFs). Individualized HRTFs are, however, typically unavailable as they require specialized equipment and a large number of measurements. This motivates the developm...
Article
Estimation of the direction of arrival (DoA) of speakers in reverberant environments is an important audio signal processing task in a wide range of applications. Recently, a reverberation-robust method for DoA estimation has been developed. It is based on the identification of time-frequency bins that are dominated by the direct path from the sour...
Presentation
Previous studies have shown that individualized head related transfer functions (HRTFs) provide improved localization performance compared to generic HRTF filters, and are therefore considered preferable for binaural sound reproduction. However, individualized HRTFs typically require a large number of measurements, which may extend to several hours...
Conference Paper
Accurate estimation of the Direction of Arrival (DOA) of a sound source is an important prerequisite for a wide range of acoustic signal processing applications. However, in enclosed environments, early reflections and late reverberation often lead to localization errors. Recent work demonstrated that improved robustness against reverberation can...
Article
Spherical microphone arrays (SMAs) and spherical loudspeaker arrays (SLAs) facilitate the study of room acoustics due to the three-dimensional analysis they provide. More recently, systems that combine both arrays, referred to as multiple-input multiple-output (MIMO) systems, have been proposed due to the added spatial diversity they facilitate. Th...
Article
Methods are proposed for modifying the reverberation characteristics of sound fields in rooms by employing a loudspeaker with adjustable directivity, realized with a compact spherical loudspeaker array (SLA). These methods are based on minimization and maximization of clarity and direct-to-reverberant sound ratio. Significant modification of reverb...
Article
The spatially localized spherical Fourier transform has been studied for various applications in recent years. One of the main arguments of the transform is the window function, typically selected to provide localization in space and to enable spectral analysis of specific parts of the sphere. In the discrete formulation of the transform, window fu...
Conference Paper
Beamforming using spherical arrays has become increasingly popular in recent years. However, the performance of beamforming algorithms is greatly affected by the limited number of sensors. This work offers a novel approach based on pre-processing of the spatial data in order to better separate the signal from noise, thus improving beamforming perfo...
Conference Paper
In acoustic conditions with reverberation and coherent sources, various spatial filtering techniques, such as the linearly constrained minimum variance (LCMV) beamformer, require accurate estimates of the relative transfer functions (RTFs) between the sensors with respect to the desired speech source. However, the time-domain support of these RTFs...
Conference Paper
Signals recorded by microphones form the basis for a wide range of audio signal processing systems. In some applications, such as humanoid robots, the microphones may be moving while recording the audio signals. A common practice is to assume that the microphone is stationary within a short time frame. Although this assumption may be reasonable und...
Article
Microphone arrays are widely used in speech enhancement systems for noisy and reverberant environments. Recently, a generalized spherical array beamforming approach was developed incorporating binaural sound reproduction in the beamforming process. This generalized spherical array beamformer (GSB) maintains the spatial information through the binau...
Article
Spatial attributes of room acoustics have been widely studied using microphone and loudspeaker arrays. However, systems that combine both arrays, referred to as multiple-input multiple-output (MIMO) systems, have only been studied to a limited degree in this context. These systems can potentially provide a powerful tool for room acoustics analysis...
Article
Full-text available
The auditory system of humanoid robots has gained increased attention in recent years. This system typically acquires the surrounding sound field by means of a microphone array. Signals acquired by the array are then processed using various methods. One of the widely applied methods is direction of arrival estimation. The conventional direction of...
Article
Rendering binaural signals from spherical microphone recordings is becoming an increasingly popular approach, with applications in telecommunications, virtual acoustics, hearing science, and entertainment. Such binaural signals can be generated from a plane-wave decomposition of a sound field measured by a spherical microphone array. This process m...
Article
Full-text available
Due to its efficiency and simplicity, the finite difference time domain method is becoming a popular choice for solving wideband, transient problems in various fields of acoustics. So far, the issue of extracting a binaural response from finite difference simulations has only been discussed in the context of embedding a listener geometry in the gri...
Conference Paper
The accuracy of direction of arrival estimation tends to degrade under reverberant conditions due to the presence of reflected signal components which are correlated with the direct path. The recently proposed direct-path dominance test provides a means of identifying time-frequency regions in which a single signal path is dominant. By analysing on...
Conference Paper
This paper focuses on speaker tracking in robot audition for human-robot interaction. Using only acoustic signals, speaker tracking in enclosed spaces is subject to missing detections and spurious clutter measurements due to speech inactivity, reverberation and interference. Furthermore, many acoustic localization approaches estimate speaker direct...
Article
Processing of microphone arrays of various configurations involves mathematical models of the surrounding sound fields. These models are based on different space and time conventions found throughout the literature. In traditional open array processing, interchanging different space and time conventions can lead to confusion between the arrival and...
Presentation
Binaural technology has various applications in virtual acoustics, architectural acoustics, tele-communications, and auditory science. One key element in binaural technology is the binaural room impulse response (BRIR), which represents a continuum of plane waves spatially filtered by head related transfer functions (HRTFs). Such BRIRs can be rende...
Article
The perception of sound by human listeners in a room has been shown to be affected by the spatial attributes of the sound field. These spatial attributes have been studied using microphone and loudspeaker arrays separately. Systems that combine both loudspeaker and microphone arrays, termed multiple-input multiple-output (MIMO) systems, facilitate...
Conference Paper
Auditory systems of humanoid robots usually acquire the surrounding sound field by means of microphone arrays. These arrays can undergo motion related to the robot’s activity. The conventional approach to dealing with this motion is to stop the robot during sound acquisition. This approach avoids changing the positions of the microphones during the...
Article
Binaural responses can be rendered from a plane-wave decomposition of a measured or a modeled sound field, spatially integrated with free-field head-related transfer functions. When represented in the spherical-harmonics domain, the decomposition order reflects the maximum spatial resolution, which is limited by the number of microphones in the sph...
Article
Measured values of acoustic absorption often vary between the laboratory and the field due to deficiencies in standard measurement methods. This paper introduces a new method of measuring acoustic absorption in the field using a spherical microphone array. Plane-wave decomposition is used to separate direct energy from reflected energy when the arr...
Chapter
This chapter provides the mathematical background necessary for studying spherical array processing. Spherical arrays typically sample functions on a sphere (e.g. sound pressure); therefore, this chapter begins by presenting the spherical coordinate system as well as some examples of functions on the sphere. Spherical harmonics are a central theme...
Chapter
Chapter 4 presented various ways to configure a spherical microphone array and discussed the advantages of each configuration. Once microphones are positioned in space in a desired configuration, e.g. on the surface of a rigid sphere, they can be connected to conditioning equipment, and the signal at each microphone can be recorded. In this chapter...
Chapter
Beamforming with spherical microphone arrays was presented in Chap. 5 as an instrument to achieve directional filtering, characterized by the beam pattern of the array. It may be desired to control the beam pattern in a more explicit manner to achieve specific properties. For example, beamformers that achieve maximum directivity index may be useful...
Chapter
Motivated by the problem of spatial sampling of a sound field by a spherical array, Chap. 3 presented methods for sampling functions on a sphere, followed by methods for reconstructing a function from its samples. These could form the basis for computing the sound pressure on the surface of a sphere, given measurements by an array of microphones. H...
Chapter
The mathematical background for functions defined on the unit sphere was presented in Chap. 1. Spherical harmonics played an important role in presenting and manipulating these functions. In this chapter, functions on the sphere are defined through the formulations of fields in three dimensions. Although sound fields are of primary concern in this...
Chapter
Optimal beamformer design, as presented in Chap. 6, may be very useful, but does not take into account the properties of the specific sound field producing the signals at the microphones. In this chapter, beamforming in which the beam pattern is tailored to the actual sound field is presented. This beamforming distinguishes between the desired sign...
Chapter
Spherical microphone arrays are realized by placing microphones in three-dimensional space and recording the signals at the microphone locations. When the microphones are placed on the surface of a sphere, they sample the sound pressure at the sphere surface. Estimation of the sound pressure function on the measurement sphere may depend on the samp...
Conference Paper
Full-text available
One of the important tasks of a humanoid-robot auditory system is speaker localization. It is used for the construction of the surrounding acoustic scene and as an input for additional processing methods. Localization is usually required to operate indoors under high reverberation levels. Recently, an algorithm for speaker localization under these...
Article
Reconstruction of binaural room impulse responses (BRIRs) from spherical microphone array measurements in a room and a given head-related transfer function set is beneficial for binaural reproduction with listener individualization, and for applying head rotations without needing to make numerous measurements of the sound field. Such algorithms of...
Article
Full-text available
An important aspect of a humanoid robot is audition. Previous work has presented robot systems capable of sound localization and source segregation based on microphone arrays with various configurations. However, no theoretical framework for the design of these arrays has been presented. In the current paper, a design framework is proposed based on...
Article
One of the major challenges encountered when localizing multiple speakers in real world environments is the need to overcome the effect of multipath distortion due to room reverberation. A wide range of methods has been proposed for speaker localization, many based on microphone array processing. Some of these methods are designed for the localizat...
Conference Paper
Full-text available
The finite difference time domain method has direct applications in musical instrument modeling, simulation of environmental acoustics, room acoustics and sound reproduction paradigms, all of which benefit from auralization. However, rendering binaural impulse responses from simulated data is not straightforward to accomplish as the calculated pres...
Conference Paper
Circular microphone arrays facilitate spatial processing and analysis of sound fields in applications where the sound field sources are primarily expected from the azimuthal directions. The operating frequency bandwidth of the array depends on the array aperture and on the number of microphones. At high frequencies, spatial aliasing generates side-...
Conference Paper
Microphone array processing and beamforming methods are widely used to enhance speech signals in noisy and reverberant environments. Recently, a generalized spherical array beamformer (GSB) was introduced incorporating beamforming with binaural sound reproduction. This GSB improves both the spatial realism and the speech intelligibility in the repr...
Conference Paper
A planar wavefront assumption facilitates significant simplifications in microphone array processing algorithms because information on the distance of the source is not required. This assumption was studied and formulated for spherical microphone arrays and point sources, but has not been studied yet for directional sources, which represent realist...
Conference Paper
Full-text available
A recent and fast evolving application for microphone arrays is the auditory systems of humanoid robots. These arrays, in contrast to conventional arrays, are not fixed in a given position, but move together with the robot. While imposing a challenge to most conventional array processing algorithms, this movement offers an opportunity to enhance pe...
Conference Paper
Full-text available
The technique of rendering binaural room impulse responses from spatial data captured by spherical microphone arrays has been recently proposed and investigated perceptually. The finite spatial resolution enforced by the microphone configuration restricts the available frequency bandwidth and, accordingly, modifies the perceived timbre of the playe...
Article
Microphone arrays are used in speech signal processing applications such as teleconferencing and telepresence, in order to enhance a desired speech signal in the presence of speech signals from other speakers, reverberation and background noise. These arrays usually provide a single-channel output, so that no spatial information is available in the...
Article
The technique of rendering binaural room impulse responses from spatial data captured by spherical microphone arrays has been recently proposed and investigated perceptually. The finite spatial resolution enforced by the microphone configuration restricts the available frequency bandwidth and, accordingly, modifies the perceived timbre of the playe...

Citations

... Playable sound clips are directly embedded in the PDF and HTML versions of the article in the places indicated by authors. For a demonstrative example, we refer to [1]. Acta Acustica offers the possibility to organize Topical Issues, where the organizers become guest editors to handle the review process. ...
... lowing [13]. As a standard practice in multichannel speech processing, we consider beamformed audio that is derived from the four non-binaural channels as input [5,13,18]. We use a maximally directive beamformer formulation that is optimized using a minimum-variance distortionless response algorithm with a diffuse noise covariance and anechoic steering vector. ...
... In more recent studies, the beamforming approach developed in previous work was applied to arrays of arbitrary configuration, such as arrays mounted on helmets [142], or glasses [143], linear arrays [43,144], and wall mounted planar arrays [145]. These recent studies extended previous work which was mostly developed for Ambisonics signals. ...
... Metaverse runs in a variety of places, from a relatively quiet house to a space where a variety of people gathers. Donley et al. [251] proposed Linearly Constrained Minimum Variance (LCMV), an automated solution for multi-channel signal enhancement to improve voice communication in a noisy environment. They use the beamformer to estimate the relative source contribution of each source in the mixture and then used to weight statistical estimates of the spatial properties of each source used for the final separation. ...
... Finally, the proposed upsampling is possibly not restricted to signals obtained from spherical microphone arrays but might be applicable to microphone arrays with arbitrary geometry. The proposed upsampling algorithm could thus also improve other beamforming based binaural reproductions such as [49], [50], [51]. ...
... Since the ear-aligned HRTFs have less energy at higher orders, HRTF order truncation becomes less critical. A method for enabling head-tracked binaural synthesis wi