
Sebastià V. Amengual Garí
Dr.-Ing.
Audio Research Scientist @ Reality Labs Research, Meta (formerly Oculus Research)
About
70 Publications
64,700 Reads
859 Citations
Introduction
Doing audio research for AR/VR - perception, room acoustics, spatial audio - at Facebook Reality Labs (formerly Oculus Research).
Publications (70)
Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf, restricted solely to their visual perception of the environment. We introduce audiovisual navigation for complex, acoustically and visually realistic 3D environments. By both seeing and hearing, the agent must learn to navigate to a sounding obj...
The spatial decomposition method (SDM) can be used to parameterize and reproduce a sound field based on measured multichannel room impulse responses (RIRs). In this paper we propose optimizations of SDM to address the following questions and issues that have recently emerged in the development of the method: (a) accuracy in direction-of-arrival (DO...
The full text is available here: https://www.aes.org/tmpFiles/elib/20220820/21850.pdf
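The TDOA-based directional analysis at the core of SDM can be sketched as follows. The array geometry, windowing, and least-squares solver below are illustrative assumptions, not the optimized configuration the paper proposes:

```python
import numpy as np
from itertools import combinations

def sdm_doa(frame, mic_pos, fs, c=343.0):
    """Estimate one direction of arrival for a short RIR window, in the
    spirit of SDM's TDOA-based estimator (illustrative sketch).
    frame:   (n_mics, n_samples) windowed segment of a multichannel RIR
    mic_pos: (n_mics, 3) microphone positions in metres
    """
    A, tau = [], []
    for i, j in combinations(range(len(mic_pos)), 2):
        # TDOA from the cross-correlation peak (sub-sample refinement omitted)
        xc = np.correlate(frame[i], frame[j], mode="full")
        lag = np.argmax(xc) - (frame.shape[1] - 1)
        tau.append(lag / fs)
        A.append(mic_pos[j] - mic_pos[i])
    # Plane-wave model: tau_ij = u . (r_j - r_i) / c, i.e. solve A u = c * tau
    u, *_ = np.linalg.lstsq(np.asarray(A), c * np.asarray(tau), rcond=None)
    return u / np.linalg.norm(u)
```

In the full method this estimate is computed for a sliding window over the entire RIR, and each pressure sample of a reference channel is then assigned the corresponding direction.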
Understanding perceived room acoustical similarity is crucial to generate perceptually optimized audio rendering algorithms that maximize the perceived quality while minimizing the computational cost. In this paper we present a perceptual study in which listeners c...
This paper presents a dataset of spatial room impulse responses (SRIRs) and 360° stereoscopic video captures of a variable acoustics laboratory. A total of 34 source positions are measured with 8 different acoustic panel configurations, resulting in a total of 272 SRIRs. The source positions are arranged in 30° increments at concentric circles of ra...
We present an acoustic navigation experiment in virtual reality (VR), where participants were asked to locate and navigate towards an acoustic source within an environment of complex geometry using only acoustic cues. We implemented a procedural generator of complex scenes, capable of creating environments of arbitrary dimensions, multiple rooms, a...
Experiments testing sound for augmented reality can involve real and virtual sound sources. Paradigms are either based on rating various acoustic attributes or testing whether a virtual sound source is believed to be real (i.e., evokes an auditory illusion). This study compares four experimental designs indicating such illusions. The first is an AB...
The task of Novel View Acoustic Synthesis (NVAS) - generating Room Impulse Responses (RIRs) for unseen source and receiver positions in a scene - has recently gained traction, especially given its relevance to Augmented Reality (AR) and Virtual Reality (VR) development. However, many of these efforts suffer from similar limitations: they infer RIRs...
The authors present a perceptual evaluation of the binaural rendering quality of signals from several types of baffled microphone arrays. They employ the multi-stimulus category rating (MuSCR) paradigm that does not require a reference stimulus. The tested conditions also comprise a very high numerical accuracy stimulus, given the highest quality r...
Smart glasses are increasingly recognized as a key medium for augmented reality, offering a hands-free platform with integrated microphones and non-ear-occluding loudspeakers to seamlessly mix virtual sound sources into the real-world acoustic scene. To convincingly integrate virtual sound sources, the room acoustic rendering of the virtual sources...
Full text available at https://eurasip.org/Proceedings/Eusipco/Eusipco2024/pdfs/0000116.pdf
Spatial room impulse responses (SRIRs) facilitate the rendering of virtual sound sources to realistically augment a real-world acoustic scene, for example in the context of augmented reality. With the integration of microphon...
This paper presents a method for sound field interpolation/extrapolation from a spatially sparse set of binaural room impulse responses (BRIRs). The method focuses on the direct component and early reflections, and is framed as an inverse problem seeking the weight signals of an acoustic model based on the time-domain equivalent source (TES). Once...
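The inverse-problem framing can be illustrated with a minimal single-frequency equivalent-source sketch. The monopole basis, geometry, regularization weight, and function names below are assumptions for illustration only; the paper's actual method operates on time-domain equivalent sources and binaural signals:

```python
import numpy as np

def greens(src_pos, rec_pos, k):
    """Free-field monopole Green's function exp(-jkr)/(4*pi*r),
    evaluated for every receiver/source pair."""
    r = np.linalg.norm(rec_pos[:, None, :] - src_pos[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def interpolate_pressure(src_pos, mic_pos, p_mic, target_pos, k, lam=1e-6):
    """Fit equivalent-source weights to pressures measured at a few points
    (one frequency), then evaluate the fitted model at new positions."""
    G = greens(src_pos, mic_pos, k)                       # (n_mics, n_src)
    # Tikhonov-regularized least squares: w = (G^H G + lam*I)^(-1) G^H p
    w = np.linalg.solve(G.conj().T @ G + lam * np.eye(G.shape[1]),
                        G.conj().T @ p_mic)
    return greens(src_pos, target_pos, k) @ w
```

Once the weights are found, the same model extrapolates the field to any position where the equivalent-source representation remains valid.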
Full text available at: https://research.chalmers.se/publication/542048/file/542048_Fulltext.pdf
In auditory augmented reality applications, virtual sound sources can be added to a real-world acoustic environment by processing each source signal with a spatial room impulse response (SRIR) to render acoustic characteristics of...
This paper is concerned with the spatial interpolation of Binaural Room Transfer Functions (BRTFs). The proposed method is a binaural extension of Room Transfer Function (RTF) interpolation methods framed as inverse problems, and is based on a parametric representation of the sound field using either Plane Waves (PWs) or Equivalent Sources (ESs). O...
Virtual acoustics can auditorily transport a person into a different environment, add virtual elements to the real environment, or make that environment appear modified to the user. By now, virtual acoustics achieves perceptually convincing results to the extent that it is used both as a tool in research and as a technology in co...
This article formulates and evaluates four different methods for six-degrees-of-freedom binaural reproduction of head-worn microphone array recordings, which may find application within future augmented reality contexts. Three of the explored methods are signal-independent; utilising: least-squares, magnitude least-squares, or plane wave decomposit...
Various robot systems have been proposed in the past to automate the tedious and time-consuming room acoustic measurement process. While small-scale measurements within a limited area can be realized with robotic arms, room-scale measurements require robots that can travel larger distances and ideally navigate through their environment autonomously...
The torso and shoulder affect the head-related transfer function (HRTF) by means of reflection and diffraction. The reflection is strongest if the ear, source, and shoulder are approximately aligned and superimposes a comb-filter upon the HRTF magnitude spectrum that can have a depth of up to 5 dB above approximately 700 Hz. In case the direct soun...
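The comb filter produced by a single shoulder reflection can be illustrated with a two-path model; the gain and delay below are hypothetical, chosen only to show how the peak-to-notch ripple depth follows from the relative reflection gain:

```python
import numpy as np

# Direct path plus one reflection with relative gain a and extra delay tau:
#   H(f) = 1 + a * exp(-j * 2*pi*f * tau)
# The peak-to-notch ripple depth is 20*log10((1 + a) / (1 - a)) dB.
a, tau = 0.25, 0.6e-3                     # hypothetical gain, 0.6 ms extra path
f = np.linspace(0.0, 20e3, 2000)          # frequency axis in Hz
H = 1 + a * np.exp(-2j * np.pi * f * tau)
mag_db = 20 * np.log10(np.abs(H))
depth_db = mag_db.max() - mag_db.min()    # comb depth, about 4.4 dB here
```

A gain of a ≈ 0.28 would give the roughly 5 dB depth reported in the abstract; the notch spacing 1/tau depends on the extra path length of the reflection.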
Psychoacoustic experiments have shown that directional properties of the direct sound, salient reflections, and the late reverberation of an acoustic room response can have a distinct influence on the auditory perception of a given room. Spatial room impulse responses (SRIRs) capture those properties and thus are used for direction-dependent room a...
Rendering perceptually plausible sound propagation in augmented reality (AR) requires matching the acoustic properties of the virtual signals with those of the listening space in which the user is located. In this paper, we review several approaches for the characterization of room acoustics with the goal of auralization in AR applications so that...
The spatial decomposition method (SDM) is a parametric approach for the processing of spatial room impulse responses. Although extensively used, there are some issues related to the microphone array used, the reproduction loudspeaker setup, and the signal processing that are not widely understood. For example, there have been different observations with regard...
pdf: https://research.chalmers.se/publication/532369/file/532369_Fulltext.pdf
Spherical harmonic (SH) representations of sound fields are usually obtained from microphone arrays with rigid spherical baffles whereby the microphones are distributed over the entire surface of the baffle. We present a method that overcomes the requirement for the baf...
Six-degrees-of-freedom rendering of an acoustic environment can be achieved by interpolating a set of measured spatial room impulse responses (SRIRs). However, the involved measurement effort and computational expense are high. This work compares novel ways of extrapolating a single measured SRIR to a target position. The novel extrapolation techni...
Microphone arrays consisting of sensors mounted on the surface of a rigid, spherical scatterer are popular tools for the capture and binaural reproduction of spatial sound scenes. However, microphone arrays with a perfectly spherical body and uniformly distributed microphones are often impractical for the consumer sector, in which microphone arrays...
For the evaluation of virtual acoustics for mixed realities, we distinguish between the paradigms 'authenticity', 'plausibility' and 'transfer-plausibility'. In the case of authenticity, discrimination tasks between real sound sources and virtual renderings presented over headphones are performed, whereas in case of a plausibility experiment, liste...
Additive noise produced by the recording hardware will contribute to streamed signals from spherical microphone arrays under practical conditions. For the application of binaural reproduction and under the assumption that the noise is uncorrelated between the array channels, the spectral properties and the overall level of the rendered noise in the...
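The key assumption, that uncorrelated channel noise adds in power after rendering, can be checked numerically with a simple frequency-independent sketch; the random gains below stand in for the actual rendering filters and are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ch, n_samp, sigma = 32, 48000, 1.0
# Uncorrelated sensor self-noise; one (hypothetical) rendering gain per channel
noise = rng.normal(0.0, sigma, (n_ch, n_samp))
gains = rng.normal(0.0, 1.0, n_ch)
rendered = gains @ noise
# For uncorrelated channels, powers add: var(sum g_i x_i) = sigma^2 * sum g_i^2
expected_var = sigma**2 * np.sum(gains**2)
measured_var = rendered.var()
```

In the actual binaural pipeline the gains are frequency-dependent filters, so the same power summation is applied per frequency bin, shaping the spectrum of the rendered noise.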
We propose a method for the decomposition of measured directional room impulse responses (DRIRs) into prominent reflections and a residual. The method comprises obtaining a fingerprint of the time-frequency signal that a given reflection carries, imposing this time-frequency fingerprint on a plane-wave prototype that exhibits the same propagation d...
We recently presented a method for obtaining a spherical harmonic representation of a sound field based on microphones along the equator of a rigid spherical object that ideally has a size similar to that of a human head. We refer to this setup as equatorial microphone array. Even more recently, we presented an extension of this method that allows...
Find a pdf here: http://www.soundfieldsynthesis.org/wp-content/uploads/pubs/Ahrens_etal_JASA2021.pdf
We present a method for computing a spherical harmonic representation of a sound field based on observations of the sound pressure along the equator of a rigid spherical scatterer. Our proposed solution assumes that the captured sound field is heig...
Parametric spatial audio rendering is a popular approach for low computing capacity applications, such as augmented reality systems. However, most methods rely on spatial room impulse responses (SRIR) for sound field rendering with 3 degrees of freedom (DoF), i.e., for arbitrary head orientations of the listener, and often require multiple SRIRs for...
In augmented reality applications, where room geometries and material properties are not readily available, it is desirable to get a representation of the sound field in a room from a limited set of available room impulse response measurements. In this paper, we propose a novel method for 2D interpolation of room modes from a sparse set of RIR meas...
Reverberation is essential for the realistic auralisation of enclosed spaces. However, it can be computationally expensive to render with high fidelity and, in practice, simplified models are typically used to lower costs while preserving perceived quality. Ambisonics-based methods may be employed for this purpose as they allow us to render a reverb...
Given only a few glimpses of an environment, how much can we infer about its entire floorplan? Existing methods can map only what is visible or immediately apparent from context, and thus require substantial movements through a space to fully map it. We explore how both audio and visual sensing together can provide rapid floorplan reconstruction fr...
The full text is available here: https://hal.archives-ouvertes.fr/hal-03235341/document
Spherical microphone arrays are used to capture spatial sound fields, which can then be rendered via headphones. Insight into perceptual properties of sensor self-noise is valuable in the design and construction process of such arrays. We use the Real-Time Sp...
The spatial decomposition method (SDM) aims at parameterizing a sound field as a succession of plane waves, allowing the analysis and rendering of multichannel room impulse responses (RIRs). The method was originally developed for the use with open microphone arrays, utilizing time differences of arrival to compute directional estimates. A later ve...
https://doi.org/10.5281/zenodo.4007387
This manuscript presents an open source database of Spatial Room Impulse Responses (SRIR) captured at three different performance spaces of the Detmold University of Music. It includes one medium sized concert hall (Detmold Konzerthaus), one chamber music room (Brahmssaal) and one theater (Detmold Somm...
This chapter reviews the basics of music and room acoustics perception, an overview of auralization methods for the investigation of music performance and a series of studies related to the impact of room acoustics on listeners and musicians. The acoustics of the performance environment play a major role for musicians, both during rehearsals and co...
Spherical microphone arrays are used to capture spatial sound fields, which can then be rendered via headphones. We use the Real-Time Spherical Array Renderer (ReTiSAR) to analyze and auralize the propagation of sensor self-noise through the processing pipeline. An instrumental evaluation confirms a strong global influence of different array and re...
We recently presented ReTiSAR, a framework for binaural rendering of spherical microphone array data in real-time. The array signals and the employed head-related transfer functions are processed in the spherical harmonics domain to compute the resulting ear signals and virtually place a listener into the captured sound field. In this contribution,...
Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf, restricted solely to their visual perception of the environment. We introduce audio-visual navigation for complex, acoustically and visually realistic 3D environments. By both seeing and hearing, the agent must learn to navigate to an audio-bas...
A method is proposed here to synthesize the acoustic response of a room to a musical reed wind instrument with tone holes played by a musician. The procedure uses convolution of a) two measured pulse responses and b) the mouthpiece pressure during playing. The novelty of the approach is to include the sound radiation directivity of the source in th...
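The convolution step can be sketched as follows; function and variable names are illustrative, and the paper's key addition, incorporating the source's radiation directivity, is not reproduced here:

```python
import numpy as np

def auralize(p_mouth, ir_a, ir_b):
    """Synthesize the room response to a played instrument by cascaded
    convolution of the recorded mouthpiece pressure with two measured
    impulse responses. Convolution is associative, so the order of the
    cascade does not matter."""
    return np.convolve(np.convolve(p_mouth, ir_a), ir_b)
```

For long signals an FFT-based convolution (e.g. overlap-add) would be used instead of direct `np.convolve` for efficiency.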
Augmented reality has the potential to connect people anywhere, anytime, and provide them with interactive virtual objects that enhance their lives. To deliver contextually appropriate audio for these experiences, a much greater understanding of how users will interact with augmented content and each other is needed. This contribution presents a sy...
In a musical performance, musician, instrument and room form a closed feedback loop that continuously shapes the generated sound. A virtual acoustic environment was developed to study performance adjustments systematically. In formal experiments, 11 trumpet players were recorded while performing several pieces with multiple auralized versions of re...
A basic building block of audio for Augmented Reality (AR) is the use of virtual sound sources layered on top of real sources present in an environment. In order to perceive these virtual sources as belonging to the natural scene, it is important to carefully replicate the room acoustics of the listening space. However, it is unclear to what extent...
Reverberation plays a fundamental role in the auralisation of enclosed spaces as it contributes to the realism and immersiveness of virtual 3D sound scenes. However, rigorous simulation of interactive room acoustics is computationally expensive, and it is common practice to use simplified models at the cost of accuracy. In the present study, two su...
Sound propagation in an enclosed space is a combination of several wave phenomena, such as direct sound, specular reflections, scattering, diffraction, or air absorption, among others. Achieving realistic and immersive audio in games and virtual reality (VR) requires real-time modeling of these phenomena. Given that it is not clear which of the sou...
Historic voice recordings suffer from numerous artefacts such as bandwidth limitations, noise and distortions. Apart from these unwanted effects, voice signals also exhibit changes in timbre due to interaction of the uneven transfer characteristics with voice formants. Such modifications are responsible for the characteristic timbre of histo...
Room acoustic conditions are an inherent element of every live music performance. They interact with the sound that is generated by the musicians, modifying the characteristics of the sound received by audience and musicians. While listeners usually play a passive role in the context of a live performance, musicians are part of a feedback loop comp...
Acoustical conditions on stage have an influence on the sound and character of a musical performance. Musicians constantly adjust their playing to accommodate to the stage acoustics. The study of acoustical preferences of musicians is part of the characterization of this feedback loop, which impacts on the musicians' comfort as well as on the aural...
A measurement set-up replicating a trumpet solo concert situation on stage is arranged by means of a music stand, a directive loudspeaker, and a microphone array. Spatial Room Impulse Responses are measured and analyzed to evaluate the acoustic impact of the music stand at the musician’s position, depending on the stand location and orientation. Re...
A compact tetrahedral microphone array is used to measure several controlled sound fields and compare the analysis of spatial room impulse responses with two methods: spatial decomposition method (SDM) and IRIS, a commercial system based on sound intensity vector analysis. Results suggest that the spatial accuracy of both methods is similar and in...
Concert hall acoustics have been traditionally evaluated by means of room acoustic measurements and perceptual studies, which requires an available concert hall, orchestra, and audience. This article presents a physical and perceptual comparison of room acoustics between arrays of real loudspeakers and virtual loudspeakers implemented with Wave Fie...
The sound of a musical instrument differs significantly between the musician’s position and a listener in the audience. While church organ players are used to imagining the sound of their instrument in a distance due to the widely spread stocks, orchestra musicians usually have only limited experience with perceiving their own instrument at various...
Previous studies on musicians' adjustments to room acoustics have demonstrated an influence of room acoustics on live solo music performance. Musicians adjust different aspects of the performance, such as tempo, articulation, dynamics or level. However, this effect seems to be highly dependent on individual musicians, musical pieces and instruments...
Studying the influence of room acoustics on musicians requires the possibility of analyzing a live performance in different acoustic conditions. A virtual acoustic environment provides modifiable room acoustic conditions without the necessity of moving into different rooms, thus removing non-acoustic cues and external factors that can influence the...
http://www.amise.netzwerk-musikhochschulen.de/
How does the sound of a musical instrument change at different locations of the player and the listener? How can I optimise that sound as a musician or as an acoustician? This contribution presents an interactive web concept that allows users to assess the effect of several performance parameters on the sound...
This article presents the process of modelling an acoustic environment suitable for the study of the effect of room acoustics on organ playing. A first stage of the process was already presented in a previous paper, and the present work includes the implementation of a controllable delay between the interaction of the musician with the instru...
A pilot study on the influence of different reverberation on the musical performance of organ players is presented. Using an organ with MIDI output, three different organ players are recorded performing the same pieces while a room acoustics enhancement system is used to modify the acoustic conditions of the Detmold concert hall in real time. Since...
This thesis deals with the implementation and analysis of different methods to generate a first-order directional microphone for source localization and noise measurement purposes. The methods analysed include a cardioid capsule, and virtual methods such as the combination of the signals of two omnidirectional microphones, the combination of the signals o...
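One of the virtual methods mentioned, combining two closely spaced omnidirectional capsules by delay-and-subtract, can be sketched as a far-field directivity computation; the spacing and frequency below are illustrative assumptions:

```python
import numpy as np

def differential_pattern(theta, f, d=0.02, c=343.0):
    """Far-field directivity magnitude of a two-omni delay-and-subtract
    array with spacing d: the rear capsule is delayed by d/c and
    subtracted from the front one, H = 1 - exp(-j*w*(tau_ext + tau_int)).
    For small w*d/c this approaches the cardioid shape (1 + cos(theta)),
    with a null at theta = pi (the rear)."""
    tau_ext = d * np.cos(theta) / c   # acoustic delay between the capsules
    tau_int = d / c                   # electronic delay applied to rear mic
    return np.abs(1 - np.exp(-2j * np.pi * f * (tau_ext + tau_int)))
```

The subtraction also imposes a first-order high-pass (6 dB/octave) response on-axis, which is why practical differential microphones equalize the output.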