Article

Efficient models for reverberation and distance rendering in computer music and virtual audio reality

Authors:

Abstract

This paper discusses efficient digital signal processing algorithms for real-time synthesis of dynamically controllable, natural-sounding artificial reverberation. A general modular framework is proposed for configuring a spatial sound processing and mixing system according to the reproduction format or setup and the listening conditions, over loudspeakers or headphones. In conclusion, the implementation and applications of a spatial sound processing software are described, and approaches to control interface design and effective distance effects are reviewed.


... Reverberation engine designs have been proposed wherein a per-source early reflections generator feeds into a late reverberation processor that can be shared between multiple sound sources located in the same room as the listener [73], [74]. Here, as shown in Fig. 11, the reflections generator is shared too (clustered reflections module, Section III-D) and the late reverberation is rendered by a reverb module which receives a separate mix of the source signals. ...
... Parametric reverberation algorithms suitable for this purpose have been extensively studied, including methods for analyzing, simulating or "sculpting" recorded, simulated or calculated reverberation responses [73]-[79]. Many involve a recirculating delay network as illustrated in Fig. 16, where, referring to the reverberation API properties introduced in Section II-A2 [73], [75], [80], ...
Conference Paper
Interactive audio spatialization technology previously developed for video game authoring and rendering has evolved into an essential component of platforms enabling shared immersive virtual experiences for future co-presence, remote collaboration and entertainment applications. New wearable virtual and augmented reality displays employ real-time binaural audio computing engines rendering multiple digital objects and supporting the free navigation of networked participants or their avatars through a juxtaposition of environments, real and virtual, often referred to as the Metaverse. These applications require a parametric audio scene programming interface to facilitate the creation and deployment of shared, dynamic and realistic virtual 3D worlds on mobile computing platforms and remote servers. We propose a practical approach for designing parametric 6-degree-of-freedom object-based interactive audio engines to deliver the perceptually relevant binaural cues necessary for audio/visual and virtual/real congruence in Metaverse experiences. We address the effects of room reverberation, acoustic reflectors, and obstacles in both the virtual and real environments, and discuss how such effects may be driven by combinations of pre-computed and real-time acoustic propagation solvers. We envision an open scene description model distilled to facilitate the development of interoperable applications distributed across multiple platforms, where each audio object represents, to the user, a natural sound source having controllable distance, size, orientation, and acoustic radiation properties.
... The choice of the feedback matrix is crucial for the FDN algorithm to work correctly. The popular matrix types used in FDN implementations that fulfill the requirement of being unilossless are Hadamard [27], Householder [27], random orthogonal, and identity matrices [28]. Whereas the first three are chosen to enhance specific properties of the algorithm, e.g., density of the impulse response, the identity matrix reduces the FDN to a Schroeder reverberator, or a parallel set of comb filters [6,28]. ...
... where $u_N^T = [1, \ldots, 1]$, and $I_N$ is the identity matrix [27]. The matrix of order 16, on the other hand, following [29], is constructed using the recursive embedding of a matrix of order 4: ...
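The matrix types named in these excerpts can be sketched in a few lines. The block below is a minimal illustration, not the cited authors' implementation: it assumes the common Householder form $A_N = I_N - \frac{2}{N} u_N u_N^T$ and a Sylvester-type Hadamard recursion, and the function names are hypothetical. It does not reproduce the order-16 recursive embedding mentioned above, whose construction is elided in the excerpt.

```python
import math


def householder(n):
    """Householder reflection A = I - (2/n) * u u^T with u = [1, ..., 1] (assumed form)."""
    return [[(1.0 if i == j else 0.0) - 2.0 / n for j in range(n)] for i in range(n)]


def hadamard(n):
    """Normalized Hadamard matrix built by Sylvester recursion; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # H_2k = [[H_k, H_k], [H_k, -H_k]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[x * scale for x in row] for row in h]


def is_orthogonal(a, tol=1e-9):
    """Return True if the columns of the square matrix a are orthonormal (A^T A = I)."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            dot = sum(a[k][i] * a[k][j] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True
```

Both constructions pass the orthogonality check, which is the energy-preserving property the excerpts rely on when calling these matrices suitable for a lossless prototype.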
Conference Paper
Full-text available
Reverberation is one of the most important effects used in audio production. Although nowadays numerous real-time implementations of artificial reverberation algorithms are available, many of them depend on a database of recorded or pre-synthesized room impulse responses, which are convolved with the input signal. Implementations that use an algorithmic approach are more flexible but do not let the users have full control over the produced sound, allowing only a few selected parameters to be altered. The real-time implementation of an artificial reverberation synthesizer presented in this study introduces an audio plugin based on a feedback delay network (FDN), which lets the user have full and detailed insight into the produced reverb. It allows for control of reverberation time in ten octave bands, simultaneously allowing adjusting the feedback matrix type and delay-line lengths. The proposed plugin explores various FDN setups, showing that the lowest useful order for high-quality sound is 16, and that in the case of a Householder matrix the implementation strongly affects the resulting reverberation. Experimenting with delay lengths and distribution demonstrates that choosing too wide or too narrow a length range is disadvantageous to the synthesized sound quality. The study also discusses CPU usage for different FDN orders and plugin states.
... Delay Networks (FDNs) to first design the system to be lossless, i.e., all the system poles of the FDN lie on the unit circle [1] which corresponds to non-decaying and non-increasing eigenmodes [161]. This setup ensures a smooth frequency-dependent pole magnitude when every delay element is extended with a frequency-dependent attenuation that is proportional to the delay lengths [72]. ...
... Various unitary feedback matrices have been proposed to enhance specific properties of the FDN: computational efficiency (circulant [1], sparse [171], Householder [161]), dense impulse response (Hadamard [161], circulant based on Galois sequences [172]), fitness to measured impulse response with a genetic algorithm [95], spectral flatness [173] and approximating a geometric model [94,174]. • block triangular concatenation: By Theorem 8, two unilossless matrices $E$ and $F$ may be concatenated into a block triangular matrix $\begin{bmatrix} E & G \\ 0 & F \end{bmatrix}$, where $G$ is an arbitrary matrix. ...
Thesis
Full-text available
In today's audio production and reproduction, as well as in music performance, it has become common practice to alter reverberation artificially through electronics or electro-acoustics. For music productions, radio plays, and movie soundtracks, the sound is often captured in small studio spaces with little to no reverberation to save real estate and to ensure a controlled environment such that the artistically intended spatial impression can be added during post-production. Spatial sound reproduction systems require flexible adjustment of artificial reverberation to the diffuse sound portion to help the reconstruction of the spatial impression. Many modern performance spaces are multi-purpose, and the reverberation needs to be adjustable to the desired performance style. With electro-acoustic feedback systems, also known as Reverberation Enhancement Systems (RESs), it is possible to extend the physical reverberation to the desired one. These examples demonstrate a wide range of applications where reverberation is created and enhanced artificially employing signal processing techniques. A major challenge of designing artificial reverberators is the high complexity of the physical reverberation process. Even small office spaces of 40 m^3 exhibit more than 10^7 acoustic modes; in concert halls, the number of acoustic modes can surpass 10^9 in the audible range. The room geometry, as well as the interaction with the boundary materials, can also be fairly complex. Whereas these complex considerations are mandatory for simulations of specific spaces, used for example for the acoustic and architectural planning of a concert venue, they are somewhat misleading in the realm of artistic applications. The focus on perceptually convincing artificial reverberation algorithms provides the freedom to make some simplifications to the generation process, leading to the recursive systems, which play a central role in this dissertation.
Two specific formulations of recursive systems for artificial reverberation are considered. The first is Feedback Delay Networks (FDNs), which are built around multiple delays that are fed back to their inputs and thereby mimic the recursive process of sound waves bouncing back and forth in an acoustic space. The second is RESs, which are installed in rooms to extend the physical reverberation via electro-acoustic feedback between microphones and loudspeakers. The main objective of artificial reverberators is to recreate and enhance room impulse responses while considering three aspects: i) accurate recreation of physical spaces; ii) delivering perceptually convincing spaces; and iii) efficiency of processing and parameterization. The primary goal of this dissertation is to achieve better control over the evolution of the artificial reverberation over time, namely the evolution of normal modes and reflections over time. The decay rate of normal modes most importantly determines the stability of the system, but also the perceptual quality of the artificial reverberation. For this purpose, existing network topologies for artificial reverberation are unified in the general FDN framework. For the FDN, an analytic formulation of the polynomial governing the recursive behavior is presented from which analytic constraints on the angular distribution of the decaying modes are derived. Lossless FDNs are commonly used as a design prototype for artificial reverberation algorithms for which all normal modes neither decay nor rise. The lossless property is dependent on the feedback matrix, which connects the output of a set of delays to their inputs, and the lengths of the delays. This work presents the most general class of feedback matrices that yields lossless FDNs regardless of the lengths of the delays. As a secondary goal, the temporal features of impulse responses produced by FDNs, i.e., the number of echoes per time interval and its evolution over time, are analyzed.
This so-called echo density is related to known measures of mixing time and their psychoacoustic correlates such as perception of the room size. It is shown that the echo density of FDNs follows a polynomial function, whereby the polynomial coefficients can be derived from the lengths of the delays for which an explicit method is given. The mixing time of impulse responses can be predicted from the echo density, and conversely, the desired mixing time can be achieved by a derived mean delay length. In the last part of this dissertation, a novel time-variant reverberation algorithm is introduced. By modulating the feedback matrix nearly continuously over time, an intricate pattern of concurrent amplitude modulations of the feedback paths evolves. It is demonstrated that the perceived quality of the decaying normal modes can be enhanced by the feedback matrix modulation. The same technique of time-varying feedback matrices is applied in multichannel sound systems to improve the system's stability. It is shown with a statistical approach that time-varying mixing matrices can achieve optimal stability improvement for a higher number of channels. A listening test demonstrates the improved quality of time-varying mixing matrices over comparable existing techniques.
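The FDN structure described in this abstract, a set of delay lines whose outputs are mixed by a feedback matrix and fed back to the inputs, can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's algorithm: the function name, the unit input and output weights, and the broadband per-line gains are all hypothetical simplifications, and no time-varying matrix modulation is included.

```python
def fdn_impulse_response(delays, gains, matrix, length):
    """Impulse response of a basic FDN (illustrative sketch).

    delays: delay-line lengths in samples
    gains:  broadband gain applied to each delay-line output before mixing
    matrix: square feedback matrix, matrix[i][j] routes line j into line i
    length: number of output samples to compute
    """
    n = len(delays)
    buffers = [[0.0] * m for m in delays]  # one circular buffer per delay line
    pos = [0] * n
    out = []
    for t in range(length):
        x = 1.0 if t == 0 else 0.0  # unit impulse input fed to every line
        # Read the oldest sample of each delay line.
        s = [buffers[i][pos[i]] for i in range(n)]
        out.append(sum(s))  # output is the plain sum of the line outputs
        # Attenuate, mix through the feedback matrix, add the input, write back.
        for i in range(n):
            fb = sum(matrix[i][j] * gains[j] * s[j] for j in range(n))
            buffers[i][pos[i]] = x + fb
            pos[i] = (pos[i] + 1) % delays[i]
    return out
```

With an identity feedback matrix the lines never exchange energy, so the sketch degenerates into independent recirculating delays; a dense orthogonal matrix instead spreads each echo across all lines, which is what makes the echo density grow over time.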
... The decay rate of the eigenmodes is in turn proportional to the magnitude of the system poles. It became common practice for FDNs to first design the system to be lossless, i.e., all the system poles of the FDN lie on the unit circle [10] which corresponds to non-decaying and non-increasing eigenmodes [14]. This ensures a smooth frequency-dependent pole magnitude when every delay element is extended with a frequency-dependent attenuation that is proportional to the delay lengths [1] and further developed by Jot and Chaigne [15]. ...
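The idea of an attenuation proportional to each delay length can be illustrated with a broadband special case. The sketch below assumes the usual convention that the attenuation in decibels is $-60\, m_i / (f_s T_{60})$ for a delay of $m_i$ samples; the function name and parameters are hypothetical, and a frequency-dependent design would replace each scalar gain with a filter.

```python
def delay_gains(delays, t60, fs):
    """Broadband gain per delay line so that all loops decay by 60 dB in t60 seconds.

    delays: delay lengths in samples
    t60:    target reverberation time in seconds
    fs:     sample rate in Hz
    Assumed rule: g_i = 10 ** (-3 * m_i / (fs * t60)), i.e. -60 dB * m_i / (fs * t60).
    """
    if t60 <= 0 or fs <= 0:
        raise ValueError("t60 and fs must be positive")
    return [10.0 ** (-3.0 * m / (fs * t60)) for m in delays]
```

Because the exponent is linear in the delay length, a line twice as long receives twice the attenuation in decibels, so every recirculating path decays at the same rate per unit time.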
... the result follows from (14). The preservation of the Euclidean norm employed in (14) is a defining property of unitary matrices such that they are also called energy conserving. ...
... Various unitary feedback matrices have been proposed to enhance specific properties of the FDN: computational efficiency (circulant [10], sparse [31], Householder [14]), dense impulse response (Hadamard [14], circulant based on Galois sequences [32]), fitness to measured impulse response with a genetic algorithm [33], spectral flatness [34] and approximating a geometric model [35], [36]. All those examples of feedback matrices are unitary and therefore by Lemma 1 unilossless. ...
Article
Lossless Feedback Delay Networks (FDNs) are commonly used as a design prototype for artificial reverberation algorithms. The lossless property is dependent on the feedback matrix, which connects the output of a set of delays to their inputs, and the lengths of the delays. Both unitary and triangular feedback matrices are known to constitute lossless FDNs; however, the most general class of lossless feedback matrices has not been identified. In this contribution, it is shown that the FDN is lossless for any set of delays if all irreducible components of the feedback matrix are diagonally similar to a unitary matrix. The necessity of the generalized class of feedback matrices is demonstrated by examples of FDN designs proposed in the literature.
... Optionally, the set of calculated low-level processing parameters values may be further corrected by a context compensation operator (Fig. 7) which automatically performs, in the time-frequency-energy domain, a deconvolution accounting for the acoustic response of a loudspeaker reproduction environment [63], [64]. This compensation is based on principles similar to those underlying the object-based "room-in-room" acoustic response correction method developed in [65], [66]. ...
... In addition to room acoustical descriptors, the high-level user interface Spat OPer exposes controls for the localization and the radiation of the sound source, as shown in Fig. 7 [13], [61]. The internal topology of the core rendering functions, the Room and Panning modules, is shown in Fig. 9 [13], [64], [67]. An example of impulse response produced by the system is displayed in Fig. 8. ...
... For example, at time zero we may start from a mixture having $\tfrac{1}{3}$ probability of $|u\rangle$ and $\tfrac{2}{3}$ probability of $|d\rangle$. The density matrix would evolve in time according to equation (26). ...
... A promising perspective for future quantum sound processing is to find realizable quantum operators for such matrices. In particular, the Hadamard operator and the Householder reflection are extensively used in quantum algorithms, and these were proposed as reference matrices for feedback delay networks with maximally diffusive properties [26]. In the context of QVTS, a Hadamard gate $H$ converts a phon from $|r\rangle$ to $|u\rangle$, and from $|l\rangle$ to $|d\rangle$. ...
Chapter
Full-text available
Humans have a privileged, embodied way to explore the world of sounds, through vocal imitation. The Quantum Vocal Theory of Sounds (QVTS) starts from the assumption that any sound can be expressed and described as the evolution of a superposition of vocal states, i.e., phonation, turbulence, and supraglottal myoelastic vibrations. The postulates of quantum mechanics, with the notions of observable, measurement, and time evolution of state, provide a model that can be used for sound processing, in both directions of analysis and synthesis. QVTS can give a quantum-theoretic explanation to some auditory streaming phenomena, eventually leading to practical solutions of relevant sound-processing problems, or it can be creatively exploited to manipulate superpositions of sonic elements. Perhaps more importantly, QVTS may be a fertile ground to host a dialogue between physicists, computer scientists, musicians, and sound designers, possibly giving us unheard manifestations of human creativity.
... To make the artificial reverberation sound perceptually plausible, i.e., logical and probable for the particular space, the energy decay must be frequency-dependent, which can be achieved by inserting attenuation filters into the algorithm. Over time, various types of such filters have been proposed, starting from a first-order low-pass infinite impulse response (IIR) filter [3], to biquadratic filters, which allowed control of the decay in a few frequency bands [4], to high-order filter designs [5]. ...
... for each entry in a matrix of identical structure [4,12]: ...
Conference Paper
Full-text available
Artificial reverberation algorithms aim at reproducing the frequency-dependent decay of sound in a room that is perceived as plausible for a particular space. In this study, we evaluate a feedback delay network reverberator with a modified cascaded graphic equalizer as an attenuation filter in terms of accurate reproduction of measured impulse responses of three rooms with different decay characteristics. First, the late reverb is synthesized by the proposed method and mixed with the early reflections separated from the original signal. The synthesized and measured signals are compared in terms of their decay characteristics and reverberation time values. The experiment shows that the proposed reverberator design reproduces real impulse responses well, although the decay-rate error exceeds the just noticeable difference of 5% in many cases. Additionally, perceptual qualities of the synthesized sounds were assessed through a listening test. Four qualities were tested for three room impulse responses and three kinds of stimuli. The results show that for the qualities reverberance, clarity, and distance, on average 75-79% of participants noticed only a slight or no difference between the measured and synthetic reverbs. Similar results were obtained for the speech and singing voice stimuli and the reverberation of the lecture room and concert hall.
... The direct sound and early reflections, which are important to determine a sound source location in space with precision [29], are simulated with the ISM. Diffusion and late reflections (or reverberation tail), which do not need the same precision, are simulated with a recursive filter structure called feedback delay network (FDN) [30]. The use of an FDN for reverberation modeling is justified within the framework of statistical reverberation theory, as it achieves an increasing overlap (or density) of the acoustic modes in the frequency domain and reflections in the time domain [31]. ...
... As said before, the reverberation tail is usually considered diffuse and resembles a stochastic noise that is attenuated exponentially with time. An eight-line FDN [30] allows modeling this diffusion and reverberation tail behavior [24]. ...
Article
There is a growing interest in the development and the evaluation of real-time auditory virtual environments (AVE). The implementation of this type of simulation system on general-purpose computers is still a challenge, and there are few studies that evaluated the perceived quality of synthesized sounds of simulated acoustic scenes. To evoke in the listener a correct image of the modeled space, the system must be dynamic and interactive. That is, it must respond to the changes in the acoustic scenario produced by the listener's movement, in a perceptually acceptable time and with an update rate that guarantees continuity in the reproduction of sound events. Hard real-time systems ensure that a given task runs within a given time interval, providing deterministic behavior for applications with time restrictions. In the current article, a computational model to implement binaural synthesis in a hard real-time AVE is presented and evaluated. The computer model was implemented in an open-source auralization system. Measurements and real-time simulations in a university classroom were carried out to validate reverberation time parameters and to evaluate system performance. Also, measured and simulated binaural soundtracks (composed from anechoic stimuli) were compared in terms of three selected perceptual attributes for subjective evaluations of static positions. The results showed that real-time performance was acceptable according to values previously reported in the literature and that computer prediction errors for the measured parameters were within the subjective difference limens. The computational model was able to generate an AVE with an acceptable overall perceptual quality.
... The focus is the three-dimensional localisation of a sound source through perceptual distance cues that are generated in virtual environments such as teleconferencing and flight simulators by means of a multichannel electronic reverberator [SBP04], [SP82], [Jot97]. The aim of the project is to devise a stereophonic reverberator in order to manipulate the distance of a static sound source perceived by a static listener. ...
... However, spring-line reverberation units and tape-delay analogue techniques persisted in the 1970s. Analogue electronic delay lines were not widely available until the 1980s; and towards the 1990s, sophisticated digital signal processors became widespread incorporating algorithms with functions such as early reflections shaping [Jot97], room size simulation, enhancement of the impression of spaciousness [KB92] and increment of lateral reflections. The traditional approach to real-time synthetic reverberation is based on delay networks combining feedforward paths to render early reflections and feedback paths to synthesize the late reverberation [Gar98]. ...
... In cases where the room simulation can be simplified to the extent that directly processing the signal is faster than convolution with an impulse response, this type of design is more efficient. The most suitable geometric acoustics simulation method to support direct input processing is the hybrid approach [30,31,32,33,34,35,36,37,38]. This hybrid approach uses one structure for simulating early reflections and another separate system for the late reverberation. ...
... This gives more precision to the modeling of the shortest distance reflections, thereby increasing precision in the earliest part of the impulse response. Fig. 4 shows the structure of the proposed system, which consisted of an FDN and tapped delay lines similar to the design previously published in [47,34]. The difference is that our system produces both early and late reflections using the same FDN unit. ...
Technical Report
Full-text available
In human auditory perception of space, the early part of the reverberation impulse response is more perceptually relevant than the later part. This observation has inspired many efficient hybrid acoustic modeling approaches where the early reflections are modeled in detail and late reflections are generated by efficient structures that produce a rough approximation. Many existing methods simplify the computation by using a late reverb unit that doesn't vary its energy level according to a physical model. This results in an incorrect balance of energy between the early reflections and late reverb. In this technical report we show how the late reverb energy can be estimated during the processing of the early reflections model. We apply that method in a geometrical modeling method that uses the Acoustic Rendering Equation [1] to produce a binaural acoustic simulation. We use a single Feedback Delay Network that simultaneously produces both precise early reflections and approximate late reverb. With the addition of a delay line with a small number of taps, we achieve a correct balance of early and late energy. This report also clarifies key concepts related to the use of the Acoustic Rendering Equation (ARE) and associates all the quantities in the model with physical units of measurement.
... The later soundfield also provides cues to source distance, with the DRR playing a significant role [21], as well as providing further cues to room size. [Fig. 1: RIR models; (a) direct sound arriving at time $T_0$, early reflections beginning after the initial time gap, and late reverberation, showing sound becoming increasingly diffuse with time; (b) Spat RIR model (based on [22], Fig. 5): direct sound ($R_0$), discrete early reflections ($R_1$), diffuse early reflections ($R_2$), and late reverberation ($R_3$).] Figure 1 shows two generic reverberation models representing the development of the room impulse response (RIR) over time. ...
... The (high-level) perceptual parameters map to low-level delay network coefficients to control various portions of the RIR. The mapping in MPEG-4 is based on the model described in [22]. In the synthesis, the $R_0$ portion (cf. ...
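The role of the direct-to-reverberant ratio (DRR) as a distance cue can be made concrete with a small calculation. The sketch below is an assumption-laden illustration rather than the cited model: it treats everything before a caller-chosen sample index as direct sound and everything after it as reverberation, and the function name is hypothetical.

```python
import math


def direct_to_reverberant_ratio_db(ir, split_index):
    """DRR in dB for an impulse response split at split_index (illustrative sketch).

    ir:          impulse response samples
    split_index: first sample counted as reverberation (assumed to be chosen by the caller)
    """
    direct = sum(v * v for v in ir[:split_index])
    reverb = sum(v * v for v in ir[split_index:])
    if direct <= 0.0 or reverb <= 0.0:
        raise ValueError("both segments must contain energy")
    return 10.0 * math.log10(direct / reverb)
```

In a real room the direct energy falls with source distance while the reverberant energy stays roughly constant, so a lower DRR generally corresponds to a more distant source.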
Conference Paper
Full-text available
Object-based audio is gaining momentum as a means for future audio productions to be format-agnostic and interactive. Recent standardization developments make recommendations for object formats, however the capture, production and reproduction of reverberation is an open issue. In this paper, we review approaches for recording, transmitting and rendering reverberation over a 3D spatial audio system. Techniques include channel-based approaches where room signals intended for a specific reproduction layout are transmitted, and synthetic reverberators where the room effect is constructed at the renderer. We consider how each approach translates into an object-based context considering the end-to-end production chain of capture, representation, editing, and rendering. We discuss some application examples to highlight the implications of the various approaches.
... The Schroeder Reverberator is a delay unit that uses a combination of comb and all-pass filters to produce a room response that more closely resembles the random nature of a physical reverberant environment. Since then, several more complicated, delay-based reverb implementations have been suggested [9] [18] [19] that combine feed-forward paths to represent early reflections and feedback paths to synthesize the later reverberation. ...
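The comb and all-pass building blocks mentioned in this excerpt can be sketched as follows. This is a minimal, unoptimized illustration assuming the classic arrangement of parallel feedback combs followed by series all-pass filters; the function names and any delay or gain values a caller passes in are hypothetical, not taken from the cited works.

```python
def feedback_comb(x, delay, g):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay]."""
    y = []
    for n, xn in enumerate(x):
        fb = y[n - delay] if n >= delay else 0.0
        y.append(xn + g * fb)
    return y


def allpass(x, delay, g):
    """Schroeder all-pass: y[n] = -g * x[n] + x[n - delay] + g * y[n - delay]."""
    y = []
    for n, xn in enumerate(x):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y.append(-g * xn + xd + g * yd)
    return y


def schroeder_reverb(x, comb_delays, comb_gain, allpass_delays, allpass_gain):
    """Sum parallel combs, then pass the mix through all-pass filters in series."""
    mixed = [0.0] * len(x)
    for d in comb_delays:
        for n, v in enumerate(feedback_comb(x, d, comb_gain)):
            mixed[n] += v
    for d in allpass_delays:
        mixed = allpass(mixed, d, allpass_gain)
    return mixed
```

The combs supply the recirculating decay, while the all-pass stages increase echo density without changing the long-term magnitude spectrum of the mix.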
... For example, a typical reverberation unit often contains a wet/dry post-mix option which has no direct correlate in objective reality. From the first delay-based reverberator [17] to the most complex artificial reverb units (e.g., Jot's Spatializer [9]), the essential goal of these units is not the room but the sound output. ...
Conference Paper
Full-text available
This paper describes an application that provides real-time high accuracy room acoustics simulation. Using a multi-touch interface, the user can easily manipulate the dimensions of a virtual space while hearing the room’s acoustics change in real-time. Such an interface enables a more fluid and intuitive control of our application, which better lends itself to expressive artistic gestures for use in such activities as sound design, performance, and education. The system relies on high accuracy room impulse responses from CATT-Acoustic and real-time audio processing through Max/MSP and provides holistic control of a spatial environment rather than applying generic reverberation via individual acoustic parameters (i.e. early reflections, RT60, etc.). Such an interface has the capability to create a more realistic effect without compromising flexibility of use.
... The system has been implemented within ListenSpace ([2], [3]), a prototyping control and authoring environment developed at IRCAM that communicates with the Spatialisateur ([5]) for sound localization and room effect rendering. This implementation takes advantage of metadata such as individual loudness of the sound sources, or spectral information, either pre-calculated or, when possible, evaluated on the fly. ...
... Control interfaces for sound spatialization have been the subject of a number of studies. The Ircam Spatialisateur ([5]), for instance, was the opportunity to focus on the perceptual aspects of the sound scene and provide the "SpatOper", a control interface made of classical interaction components (typically sliders), but presenting the sound scene as a set of perceptual acoustical properties instead of low-level digital signal processing parameters. Other works such as Move in Space ([9]) or Holophon ([8]) focused more on the temporal aspects of sound spatialization by making use of the notion of trajectory in real time. ...
Conference Paper
This work addresses the general problem of designing graphical user interfaces for non-expert users. The key idea is to help users anticipate their actions by displaying, in the interaction area, the expected evolution of a quality criterion according to the degrees of freedom which are being monitored. This concept is first applied to the control of sound spatialization: various perceptually based criteria such as "spatial homogeneity" or "spatial masking" are represented as a grey-shaded map superimposed on the background of a bird's-eye-view interface. After selecting a given sound source, the user is thus informed how these criteria will behave if the source is moved to any other location of the virtual sound scene.
... In the research field of robotics, the accuracy of sound source detection is substantially improved by the use of reflected sounds [32]. Reflected sounds are also used to produce music sounds naturally well known as the effect of reverberation [33], [34]. Thus, analyzing the effects of reflected sounds is important to reproduce a sound field space virtually that more closely and comfortably resembles reality. ...
Article
Full-text available
This study develops a 3D sound localization platform specialized in evaluating the effect of synthetic reflected sounds. 3D sound localization is one of the promising technologies for realizing a more realistic digital-twin world, and reflected sounds are indispensable for pursuing further realism. Although reflected sounds are well known to strongly affect sound localization and to contribute to a realistic 3D sound field, it is still unclear why and how they affect the perception of sound direction, since there has been no platform for evaluating this effect. Our proposed platform implements imaginary sound sources that model reflections from reflective walls by taking into account arrival time differences and using Head-Related Transfer Functions (HRTFs); these are mixed with direct sounds to evaluate the effect of reflected sounds on the perceived direction of sound sources. Experiments are conducted to demonstrate the usefulness of the proposed platform for evaluating the parameters of sound reflections that affect the perceived direction of sound sources. The experimental results show that the influence of reflected sounds varies according to their direction, and our platform is expected in the future to demystify the various effects of reflected sounds on the perceived direction of sound sources.
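The imaginary (image) sources and arrival time differences mentioned in this abstract can be illustrated with a minimal sketch. It is not the platform's implementation: it assumes a rectangular room with one corner at the origin, considers only first-order reflections, uses an assumed speed of sound of 343 m/s, and omits HRTF filtering entirely. The names `first_order_images` and `arrival_delay` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def first_order_images(source, room):
    """Mirror the source across the six walls of a shoebox room.

    `room` is (Lx, Ly, Lz); the walls are assumed to lie at 0 and L on each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflection across the wall plane
            images.append(tuple(image))
    return images


def arrival_delay(point, listener):
    """Propagation delay in seconds from `point` to `listener`."""
    return math.dist(point, listener) / SPEED_OF_SOUND


source = (1.0, 2.0, 1.5)
listener = (3.0, 3.0, 1.5)
room = (4.0, 5.0, 3.0)

direct = arrival_delay(source, listener)
for image in first_order_images(source, room):
    extra_ms = 1000.0 * (arrival_delay(image, listener) - direct)
    print(image, f"arrives {extra_ms:.2f} ms after the direct sound")
```

Each image position would then be rendered like an additional delayed source and mixed with the direct sound; wall absorption and HRTF processing would be layered on top of this geometry.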
... -Exposing a sound-designer friendly high-level control interface for the fine tuning of internal low-level parameters such as m 0 , m 1 and g 0 defined in Section 3. (Note that, as visible in Figures 3 and 9, the application of the attenuation factor g 0 to direct-path components is necessary for loudness preservation when externalization processing is applied by enabling the "diffuse tail processing" function described here.) -The use of Velvet Noise decorrelators [36] or alternative IIR network designs such as nested all-pass filters and feedback delay networks [37,38] or time-varying all-pass networks [32]. -Diffuse tail processing algorithm designs that preserve mono compatibility or employ alternative approaches to control low-frequency interaural coherence in order to match the natural properties of diffuse sound fields (see Fig. 10 and [34]). ...
Article
In both entertainment and professional applications, conventionally produced stereo or multi-channel audio content is frequently delivered over headphones or earbuds. Use cases involving object-based binaural audio rendering include recently developed immersive multi-channel audio distribution formats, along with the accelerating deployment of virtual or augmented reality applications and head-mounted displays. The appreciation of these listening experiences by end users may be compromised by an unnatural perception of the localization of frontal audio objects: commonly heard near or inside the listener’s head even when their specified position is distant. This artifact may persist despite the provision of perceptual cues that have been known to partially mitigate it, including artificial acoustic reflections or reverberation, head-tracking, individualized HRTF processing, or reinforcing visual information. In this paper, we review previously reported methods for binaural audio externalization processing, and generalize a recently proposed approach to address object-based audio rendering.
... The traditional approach to synthesizing reverberation is based on delay networks combining feedforward paths to render early reflections and feedback paths to synthesize late reverberation [32]. One of the earliest and most famous reverberation algorithms was presented by Schroeder in 1961 [5]; it has a simple structure consisting of four parallel comb filters and two series all-pass filters, as shown in Figure 1 [27]. ...
Article
Full-text available
This paper presents a study evaluating the perceptual similarity between artificial reverberation algorithms and acoustic measurements. An online headphone-based listening test was conducted and data were collected from 20 expert assessors. Seven reverberation algorithms were tested in the listening test, including the Dattorro, Directional Feedback Delay Network (DFDN), Feedback Delay Network (FDN), Gardner, Moorer, and Schroeder reverberation algorithms. A new Hybrid Moorer–Schroeder (HMS) reverberation algorithm was included as well. A solo cello piece, male speech, female singing, and a drumbeat were rendered with the seven reverberation algorithms in three different reverberation times (0.266 s, 0.95 s and 2.34 s) as the test conditions. The test was conducted online and based on the Multiple Stimuli with Hidden Reference and Anchor (MUSHRA) paradigm. The reference conditions consisted of the same audio samples convolved with measured binaural room impulse responses (BRIRs) with the same three reverberation times. The anchor was dual-mono 3.5 kHz low pass filtered audio. The similarity between the test audio and the reference audio was scored on a scale of zero to a hundred. Statistical analysis of the results shows that the Gardner and HMS reverberation algorithms are good candidates for exploration of artificial reverberation in Augmented Reality (AR) scenarios in future research.
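The Schroeder structure described in the citing excerpt above (four parallel comb filters followed by two series all-pass filters) can be sketched as follows. This is an illustrative sketch only: the delay lengths and gains are arbitrary assumptions, not values taken from any of the cited papers, and the function names are hypothetical.

```python
def comb(x, delay, g):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        feedback = y[n - delay] if n >= delay else 0.0
        y[n] = x[n] + g * feedback
    return y


def allpass(x, delay, g):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n - delay] + g*y[n - delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + x_delayed + g * y_delayed
    return y


def schroeder_reverb(x,
                     comb_delays=(1116, 1188, 1277, 1356),  # assumed values
                     comb_gain=0.84,                        # assumed value
                     allpass_delays=(225, 556),             # assumed values
                     allpass_gain=0.7):                     # assumed value
    """Four parallel combs (averaged), then two all-pass filters in series."""
    mix = [0.0] * len(x)
    for delay in comb_delays:
        for n, value in enumerate(comb(x, delay, comb_gain)):
            mix[n] += value / len(comb_delays)
    out = mix
    for delay in allpass_delays:
        out = allpass(out, delay, allpass_gain)
    return out


impulse = [1.0] + [0.0] * 4999
response = schroeder_reverb(impulse)
print(sum(1 for v in response if v != 0.0), "non-zero samples in the first 5000")
```

Choosing mutually incommensurate comb delays, as is commonly recommended, keeps the echoes of the parallel branches from piling up at the same instants.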
... Reverberation may also be generated through computationally lighter 'convolution-less' methods, such as Schroeder reverberators [36] or feedback delay networks (FDN) [37][38][39]. Such techniques are generally less accurate than convolution-based methods but can be useful to efficiently model the less critical parts of the RIR such as the late-reverberation tail [40]. ...
Chapter
Full-text available
Accurately rendering reverberation is critical to produce realistic binaural audio, particularly in augmented reality applications where virtual objects must blend in seamlessly with real ones. However, rigorously simulating sound waves interacting with the auralised space can be computationally costly, sometimes to the point of being unfeasible in real time applications on resource-limited mobile platforms. Luckily, knowledge of auditory perception can be leveraged to make computational savings without compromising quality. This chapter reviews different approaches and methods for rendering binaural reverberation efficiently, focusing specifically on Ambisonics-based techniques aimed at reducing the spatial resolution of late reverberation components. Potential future research directions in this area are also discussed.
... Based on the aforementioned research, various reverb-rendering methods which try to achieve high fidelity at reasonable costs have been proposed throughout the years. According to a comprehensive review by Valimaki et al. (2012), these methods can be generally classified in three categories: delay networks, such as feedback delay networks (Jot and Chaigne, 1991;Jot, 1997) or Schroeder reverberators (Schroeder and Logan, 1961); convolution algorithms in which a dry input signal is convolved with an omnidirectional or Ambisonics RIR; and computational acoustics, which encompass geometry-based simulations, such as the image source method (Allen and Berkley, 1979), and wave-based methods, such as the finite-difference time-domain method (Botteldooren, 1995). In practice, these categories overlap; for instance, an RIR used for convolution may be generated through computational acoustics (Pelzer et al., 2014;Schissler, Stirling, et al., 2017). ...
Thesis
Full-text available
Audio Augmented Reality (AAR) is defined as the extension of a real auditory environment through virtual sound sources. A successful AAR system should create the illusion that virtual sounds actually come from the user's environment, for which several technical challenges must be overcome. First, room acoustics must be simulated accurately to predict the reverberant sound field produced by the virtual source as sound wavefronts reach the user. Second, said sound field must be translated into a pair of sound pressure signals at the user's ears. Finally, this binaural signal must be delivered to the user through an acoustically transparent system without limiting their ability to hear real sources. This process should be able to adapt in real time to user movements in a computationally efficient way, considering that resources may be limited in practice and most of them will likely be allocated to graphics processing (e.g. in a pair of augmented reality glasses). This Thesis aims to improve current techniques for binaural audio rendering in AAR by exploring the trade-off between computational complexity and perceived quality. Several perception-focused studies were proposed to explore the different parts of the rendering process. First, a prototype AAR system with hear-through functionality was proposed and a pilot experiment was conducted to investigate how users could adapt to it over time. A second study assessed the effect of non-individualised equalisation on the perceived quality of binaural renderings reproduced with open-ear headphones. A third study evaluated several state-of-the-art methods for the binaural rendering of sound fields of limited resolution in the spherical harmonics (Ambisonics) domain. Finally, a fourth study assessed the perceptual effect of simplifying Ambisonics-based binaural reverberation in various ways. 
Even though this Thesis focuses on the AAR scenario, the findings herein may be helpful for any application that would benefit from a computationally efficient implementation of binaural audio rendering methods.
... In [208,209], Jot et al. cascaded early reflections and late reverberation in three stages. Starting from early reflections, which are panned either left or right, each reflection also serves as the input to the individual delay lines of a delay network. ...
Thesis
Full-text available
Available online with the related articles at: http://urn.fi/URN:ISBN:978-952-64-0472-1. In this dissertation, the reproduction of reverberant sound fields containing directional characteristics is investigated. A complete framework for the objective and subjective analysis of directional reverberation is introduced, along with reverberation methods capable of producing frequency- and direction-dependent decay properties. Novel uses of velvet noise are also proposed for the decorrelation of audio signals as well as artificial reverberation. The methods detailed in this dissertation offer the means for the auralization of reverberant sound fields in real time, with applications in the context of immersive sound reproduction such as virtual and augmented reality.
... Historically, the first attempt to produce frequency-dependent reverberation was made by inserting a one-pole lowpass filter into a feedback structure [2], [3]. Later, controlling the decay rate in three independent frequency bands was possible by introducing biquadratic filters with adjustable crossover frequencies [32]. In [33], a 13th-order filter comprising single bandpass filters with a second-order Butterworth bandpass filter was proposed. ...
Article
Full-text available
This paper proposes a novel algorithm for simulating the late part of room reverberation. A well-known fact is that a room impulse response sounds similar to exponentially decaying filtered noise some time after the beginning. The algorithm proposed here employs several velvet-noise sequences in parallel and combines them so that their non-zero samples never occur at the same time. Each velvet-noise sequence is driven by the same input signal but is filtered with its own feedback filter which has the same delay-line length as the velvet-noise sequence. The resulting response is sparse and consists of filtered noise that decays approximately exponentially with a given frequency-dependent reverberation time profile. We show via a formal listening test that four interleaved branches are sufficient to produce a smooth high-quality response. The outputs of the branches connected in different combinations produce decorrelated output signals for multichannel reproduction. The proposed method is compared with a state-of-the-art delay-based reverberation method and its advantages are pointed out. The computational load of the method is 60% smaller than that of a comparable existing method, the feedback delay network. The proposed method is well suited to the synthesis of diffuse late reverberation in audio and music production.
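For readers unfamiliar with velvet noise, the sketch below generates a single basic velvet-noise sequence: one randomly placed impulse of random sign per grid period. It is only the building block, under assumed parameter values; the interleaving of several sequences and the per-branch feedback filters described in the abstract are not reproduced here. The function name `velvet_noise` and its parameters are hypothetical.

```python
import random


def velvet_noise(length, sample_rate, density, seed=0):
    """Sparse sequence of +1/-1 impulses, one per grid period.

    `density` is the number of impulses per second, so the average spacing
    is sample_rate / density samples.
    """
    rng = random.Random(seed)
    grid = sample_rate / density          # average impulse spacing in samples
    sequence = [0] * length
    for m in range(int(length / grid)):   # only complete grid periods
        # Random position inside the m-th grid period.
        k = int(round(m * grid + rng.random() * (grid - 1)))
        sequence[k] = 1 if rng.random() < 0.5 else -1
    return sequence


vn = velvet_noise(length=2000, sample_rate=44100, density=2205)
print(sum(1 for v in vn if v != 0), "impulses in", len(vn), "samples")
```

Because almost all samples are zero, convolving a signal with such a sequence needs only additions and subtractions at the impulse positions, which is where the computational savings of velvet-noise methods come from.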
... Based on the aforementioned research, various reverb-rendering methods which try to achieve high fidelity at reasonable costs have been proposed throughout the years. According to a comprehensive review by Valimaki et al. (2012), these methods can be generally classified in three categories: delay networks, such as feedback delay networks (Jot, 1997;Jot and Chaigne, 1991) or Schroeder reverberators (Schroeder and Logan, 1961); convolution algorithms in which a dry input signal is convolved with an omnidirectional or Ambisonics RIR; and computational acoustics, which encompass geometry-based simulations, such as the image source method (Allen and Berkley, 1979) and wave-based methods, similar to the finite-difference time-domain method (Botteldooren, 1995). In practice, these categories overlap; for instance, an RIR used for convolution may be generated through computational acoustics (Pelzer et al., 2014;Schissler et al., 2017). ...
Article
Full-text available
Reverberation is essential for the realistic auralisation of enclosed spaces. However, it can be computationally expensive to render with high fidelity and, in practice, simplified models are typically used to lower costs while preserving perceived quality. Ambisonics-based methods may be employed to this purpose as they allow us to render a reverberant sound field more efficiently by limiting its spatial resolution. The present study explores the perceptual impact of two simplifications of Ambisonics-based binaural reverberation that aim to improve efficiency. First, a "hybrid Ambisonics" approach is proposed in which the direct sound path is generated by convolution with a spatially dense head related impulse response set, separately from reverberation. Second, the reverberant virtual loudspeaker method (RVL) is presented as a computationally efficient approach to dynamically render binaural reverberation for multiple sources with the potential limitation of inaccurately simulating listener's head rotations. Numerical and perceptual evaluations suggest that the perceived quality of hybrid Ambisonics auralisations of two measured rooms ceased to improve beyond the third order, which is a lower threshold than what was found by previous studies in which the direct sound path was not processed separately. Additionally, RVL is shown to produce auralisations with comparable perceived quality to Ambisonics renderings.
... Another approach is to introduce short delays in the feedback matrix, so that each matrix element consists of a gain and a delay [24]. Also, separate early reflection modules using finite impulse response (FIR) filters for FDNs have been suggested [25]. However, the magnitude spectrum of these filters should be designed to minimize undesirable spectral coloration. ...
Conference Paper
Full-text available
Artificial reverberation is an audio effect used to simulate the acoustics of a space while controlling its aesthetics, particularly on sounds recorded in a dry studio environment. Delay-based methods are a family of artificial reverberators using recirculating delay lines to create this effect. The feedback delay network is a popular delay-based reverberator providing a comprehensive framework for parametric reverberation by formalizing the recirculation of a set of interconnected delay lines. However, one known limitation of this algorithm is the initial slow build-up of echoes, which can sound unrealistic, and overcoming this problem often requires adding more delay lines to the network. In this paper, we study the effect of adding velvet-noise filters, which have random sparse coefficients, at the input and output branches of the reverberator. The goal is to increase the echo density while minimizing the spectral coloration. We compare different variations of velvet-noise filtering and show their benefits. We demonstrate that with velvet noise, the echo density of a conventional feedback delay network can be exceeded using half the number of delay lines and saving over 50% of computing operations in a practical configuration using low-order attenuation filters.
... Initially, a first-order lowpass infinite impulse response (IIR) filter was used because of its low computational cost and ease of design [5,8]. Later, biquadratic filters were introduced, allowing control of the decay time in three independent frequency bands with adjustable crossover frequencies [9]. In [10], a 13th-order filter comprising single bandpass filters as described in [11] and a second-order Butterworth bandpass filter was proposed. ...
Conference Paper
Full-text available
Artificial reverberation algorithms generally imitate the frequency-dependent decay of sound in a room quite inaccurately. Previous research suggests that a 5% error in the reverberation time (T60) can be audible. In this work, we propose to use an accurate graphic equalizer as the attenuation filter in a Feedback Delay Network reverberator. We use a modified octave graphic equalizer with a cascade structure and insert a high-shelf filter to control the gain at the high end of the audio range. One such equalizer is placed at the end of each delay line of the Feedback Delay Network. The gains of the equalizer are optimized using a new weighting function that acknowledges nonlinear error propagation from filter magnitude response to reverberation time values. Our experiments show that in real-world cases, the target T60 curve can be reproduced in a perceptually accurate manner at standard octave center frequencies. However, for an extreme test case in which the T60 varies dramatically between neighboring octave bands, the error still exceeds the limit of the just noticeable difference but is smaller than that obtained with previous methods. This work leads to more realistic artificial reverberation.
... The feedback matrix A needs to be unitary (or orthogonal) [Jot, 1997], meaning that when multiplied by its Hermitian transpose the result is the identity matrix [Gentle, 2007, p. 104]. ...
Thesis
Interactive rendering of spatial audio in computer games is a challenging task as computational resources are limited. Artificial reverberators provide a computationally efficient solution; however, they do not explicitly model physical parameters such as source and receiver positions. A recent method termed the scattering delay network (SDN) is a type of digital waveguide network that models the physical parameters of a room at a low computational cost, allowing real-time interactive auralisations. An experiment was carried out, comparing the perceived naturalness and pleasantness of reverberation generated with SDNs to that of reverberation generated with other artificial reverberation methods. For both naturalness and pleasantness, the differences in ratings for reverberation generated using the feedback delay network (FDN) method and reverberation generated using SDNs were statistically significant. The mean naturalness rating was 12% higher for SDN stimuli than FDN stimuli, and the mean pleasantness rating was 8% higher. Similarly, the ratings for reverberation generated by convolution with recorded binaural room impulse responses (BRIRs) were significantly different from ratings achieved by reverberation generated using SDNs. The SDN stimuli achieved a mean naturalness rating that was 6% higher and a mean pleasantness that was 5% higher than those generated using BRIRs. Speculative hypotheses as to why stimuli generated using FDNs and recorded BRIRs achieved lower ratings than those achieved with SDNs are presented, with suggestions for further research. It was also found that CATT-Acoustics models which had been simplified to a bare rectangular room received lower ratings than models that included furniture or irregular room shaping, suggesting that the scattering and mixing effects of irregularities cause improvements in perceived quality of reverberation.
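The unitarity condition quoted in the citing excerpt above can be checked numerically. The sketch below builds a real Householder matrix, a common choice of lossless FDN feedback matrix, and verifies that multiplying it by its transpose gives the identity. For a real matrix the Hermitian transpose is the ordinary transpose, which is what this sketch assumes; the helper names are hypothetical.

```python
def householder(n):
    """A = I - (2/n) * ones(n, n): a real, orthogonal n-by-n matrix."""
    return [[(1.0 if i == j else 0.0) - 2.0 / n for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_orthogonal(a, tol=1e-12):
    """True if A times its transpose equals the identity within `tol`."""
    product = matmul(a, transpose(a))
    n = len(a)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


print(is_orthogonal(householder(4)))             # lossless candidate
print(is_orthogonal([[1.0, 1.0], [0.0, 1.0]]))   # not orthogonal
```

An orthogonal feedback matrix preserves the energy circulating in the delay lines, so any decay is then set entirely by the attenuation applied in each line.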
... In the early days, the most cheaply available filter was a one-pole lowpass filter [6,14,15,16,17,18]. Biquadratic filters allow control of the decay time in three independent frequency bands, with adjustable crossover frequencies [19,20]. More advanced studies try to emulate the frequency response of reflection coefficients in octave bands by applying high-order IIR filters [9,10,21]. ...
Conference Paper
Full-text available
The reverberation time is one of the most prominent acoustical qualities of a physical room. Therefore, it is crucial that artificial reverberation algorithms match a specified target reverberation time accurately. In feedback delay networks, a popular framework for modeling room acoustics, the reverberation time is determined by combining delay and attenuation filters such that the frequency-dependent attenuation response is proportional to the delay length, thereby complying with a global attenuation per second. However, only few details are available on the attenuation filter design, as the approximation errors of the filter design are often regarded as negligible. In this work, we demonstrate that the error of the filter approximation propagates in a non-linear fashion to the resulting reverberation time, possibly causing large deviations from the specified target. For the special case of a proportional graphic equalizer, we propose a non-linear least squares solution and demonstrate the improved accuracy with a Monte Carlo simulation.
... This can be attributed to the FDN's flexibility in design, computational efficiency, and independent control over energy decay, amount of diffusion and total energy in each frequency band [13]. Further work has been done to maximize the echo density of FDNs by using specific feedback matrices [18]-[20]. ...
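The rule that attenuation must be proportional to delay length can be made concrete in the simplest, frequency-independent case. The sketch below assumes a single broadband gain per delay line, chosen so that a signal recirculating through that line has lost 60 dB after T60 seconds. The frequency-dependent filter design discussed in the abstract is not covered, and the function names are hypothetical.

```python
import math


def delay_gain(delay_samples, t60_seconds, sample_rate):
    """Broadband gain for one delay line giving a 60 dB decay after T60.

    The attenuation in dB is proportional to the delay length:
    gain_dB = -60 * delay_samples / (sample_rate * t60_seconds).
    """
    return 10.0 ** (-3.0 * delay_samples / (sample_rate * t60_seconds))


def to_db(gain):
    return 20.0 * math.log10(gain)


sample_rate = 48000
t60 = 2.0
for delay in (1000, 2000):
    g = delay_gain(delay, t60, sample_rate)
    passes = sample_rate * t60 / delay   # trips through the line in T60 seconds
    print(delay, round(g, 6), round(passes * to_db(g), 6))
```

Doubling the delay length doubles the attenuation in dB per pass, so every line decays at the same rate per second regardless of its length.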
Article
Feedback delay networks (FDNs) are frequently used to generate artificial reverberation. This contribution discusses the temporal features of impulse responses produced by FDNs, i.e., the number of echoes per time unit and its evolution over time. This so-called echo density is related to known measures of mixing time and their psychoacoustic correlates such as auditive perception of the room size. It is shown that the echo density of FDNs follows a polynomial function, whereby the polynomial coefficients can be derived from the lengths of the delays for which an explicit method is given. The mixing time of impulse responses can be predicted from the echo density, and conversely, a desired mixing time can be achieved by a derived mean delay length. A Monte Carlo simulation confirms the accuracy of the derived relation of mixing time and delay lengths.
... It is also used for enabling the computation of a reference distance, which is one of the fields associated with each perceptual sound object in MPEG-4. It informs the renderer about the distance at which the defined parameters are valid, and at other distances the parameters are modified through a distance-dependent rolloff factor causing automatic update to the response (and the distance effect) [17,18]. Finally, this receiver object (being unique for each scene), contains the information about how the data held in the IR measurement source positions should be taken into account when the parameters are set for the sound source objects. ...
... As above, the filter equation can be arrived at from its transfer Matrix multiplication is the next required process. Several matrices have been suggested, as discussed practically in [193,89]. A Householder matrix: identity matrix, was found to perform well, so is chosen here (realised in the , and matrices at the start of the hrtfreverb opcode). ...
... We introduce reverberation using the Image Source Method, which consists of replicating the sound as virtual images resulting from the reflections of the sound waves on the walls [6]. More complex reverberation algorithms [39] are not necessary, since it has been shown that it is only the first reflections that are useful to help with the sound localisation process [4]. Likewise, sound localisation can be considerably improved by tracking the position of the head, and adjusting the sound sources accordingly. ...
Article
Full-text available
The Devices for Assisted Living (DALi) project is a research initiative sponsored by the European Commission under the FP7 programme aiming for the development of a robotic device to assist people with cognitive impairments in navigating complex environments. The project revisits the popular paradigm of the walker enriching it with sensing abilities (to perceive the environment), with cognitive abilities (to decide the best path across the space) and with mechanical, visual, acoustic and haptic guidance devices (to guide the person along the path). In this paper, we offer an overview of the developed system and describe in detail some of its most important technological aspects.
... Each particular choice of the feedback matrix has corresponding implications on subjective or objective qualities of the reverberator [32]; e.g. in the particular case of the identity matrix, A = I, the FDN structure reduces to N comb filters connected in parallel and acts as the Schroeder reverberator [3]. Note, however, that unitary matrices are only a subset of possible lossless feedback matrices [10], [33]; we elaborate on this point in the next subsection. ...
Article
Full-text available
An acoustic reverberator consisting of a network of delay lines connected via scattering junctions is proposed. All parameters of the reverberator are derived from physical properties of the enclosure it simulates. It allows for simulation of unequal and frequency-dependent wall absorption, as well as directional sources and microphones. The reverberator renders the first-order reflections exactly, while making progressively coarser approximations of higher-order reflections. The rate of energy decay is close to that obtained with the image method (IM) and consistent with the predictions of Sabine and Eyring equations. The time evolution of the normalized echo density, which was previously shown to be correlated with the perceived texture of reverberation, is also close to that of IM. However, its computational complexity is one to two orders of magnitude lower, comparable to the computational complexity of a feedback delay network (FDN), and its memory requirements are negligible.
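The remark in the citing excerpt above, that an FDN with A = I reduces to parallel comb filters, can be seen in a small simulation. This is a minimal sketch under stated assumptions: tiny delay lengths chosen for readability, one broadband gain at the end of each delay line, and an output formed by summing the attenuated delay-line outputs. The function name is hypothetical and the scattering delay network of the abstract is not modelled.

```python
def fdn_impulse_response(delays, gains, matrix, length):
    """Impulse response of a basic feedback delay network.

    Each delay line is a circular buffer; its attenuated output is mixed
    through `matrix` and fed back to the delay-line inputs.
    """
    n = len(delays)
    lines = [[0.0] * d for d in delays]
    pos = [0] * n
    out = []
    for t in range(length):
        x = 1.0 if t == 0 else 0.0
        # Oldest sample of each line, attenuated at the end of the line.
        taps = [gains[i] * lines[i][pos[i]] for i in range(n)]
        out.append(sum(taps))
        for i in range(n):
            feedback = sum(matrix[i][j] * taps[j] for j in range(n))
            lines[i][pos[i]] = x + feedback
            pos[i] = (pos[i] + 1) % delays[i]
    return out


identity = [[1.0, 0.0], [0.0, 1.0]]
householder_2x2 = [[0.0, -1.0], [-1.0, 0.0]]   # I - (2/2) * ones(2, 2)

combs = fdn_impulse_response((3, 5), (0.5, 0.5), identity, 16)
mixed = fdn_impulse_response((3, 5), (0.5, 0.5), householder_2x2, 16)

print([t for t, v in enumerate(combs) if v != 0.0])  # multiples of 3 or 5 only
print([t for t, v in enumerate(mixed) if v != 0.0])  # also sums such as 3 + 5
```

With the identity matrix, echoes appear only at multiples of each individual delay; a mixing matrix couples the lines and adds echoes at sums of different delays, which is what raises the echo density.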
... These are carried out as a direct application of research conducted on sound spatialization. Ircam's Spat system, used in music and sound production, handles the simulation and rendering of the effect produced by sound sources placed at given positions in a virtual room whose acoustic characteristics can be configured using perceptually relevant parameters (Jot, 1997). Accordingly, the interface developed, following the orchestra metaphor, takes the form of a two-dimensional space showing the positions of the various instruments or polyphony channels and of the listener. ...
Article
Full-text available
This article presents an overview of the European project SemanticHIFI, which aims to foreshadow the hi-fi systems of tomorrow through novel functions for content-based management and manipulation of music recordings. The limits of existing equipment stem from those of music distribution formats, which have for several decades taken the form of stereophonic recording signals and allow only elementary modes of manipulation. Extending the media that carry musical information to richer representations, obtained either directly from renewed production processes or from personalized indexing tools, makes innovative functions possible: personalized classification, content-based browsing, sound spatialization, composition, sharing over networks while preserving the rights attached to the works, etc. These functions result from research activities carried out within the project at the forefront of several disciplines: analysis and processing of digital audio signals, musical knowledge engineering, human-computer interfaces, and distributed network architectures. The project also includes an integration phase aimed at building application prototypes that validate all of these functions in a unified technical environment compatible with the constraints of the consumer electronics market. After describing the context of the project, its main objectives, and the problem of describing and automatically extracting musical content, which cuts across all of its work, the article presents the main functions developed (inter- and intra-document browsing, performance, composition, and sharing over networks), and then specifies how they are technically integrated in the form of prototype applications.
... Although research on spatialized sound rendering has a relatively long history in the computer music and sound processing community [3,4,1], little research has investigated the effect of spatialized sound when included in a virtual reality environment. An exception is the work presented in [5], where two experiments were proposed in order to investigate potential benefits of high quality auditory rendering in virtual reality. ...
Conference Paper
We describe an experiment whose goal is to investigate the use of different audio rendering techniques delivered through headphones while walking inside a wide four-sided CAVE environment. In our experiment, participants had to physically walk along a virtual path while exposed to different auditory stimuli. Each subject was exposed to three conditions (stereo sound, binaural sound spatially congruent with the visuals, and binaural sound spatially incongruent with the visuals) and had to rate each one subjectively. The results of the experiment showed increased preference ratings for the binaural audio rendering, followed by stereo rendering. As expected, incongruent spatial cues were ranked significantly lower. Binaural rendering can deliver a more immersive experience and does not require specialized hardware.
... The guidance may consist of an objective measure of a perceptual factor, such as the out-of-phase effect here (which shifts the sound to the side and creates an unpleasant impression of a sound that is both inside and outside the head), but also of improved legibility of the parameters that can be varied. This is the principle of Spat_Oper, a control interface for SPAT ([18], [19]), a program for MAX/MSP developed by IRCAM that processes sound sources in order to spatialize them. It combines reverberation computation tools with rendering engines that allow these sources to be heard on different reproduction systems. A simple vertical line would represent a monophonic sound. ...
... This is clearly unmanageable; therefore, a reverberation algorithm is used. We used a Feedback Delay Network (FDN) (Jot 1997) based reverberator that is controlled by a set of perceptual parameters set in the sound source object. This solution is used in IRCAM's Spatialisateur (Spat). ...
Article
This paper presents an ongoing research project taking place at the University of Wollongong which aims to develop a hardware and software framework for the creation, manipulation and rendering of complex 3D sound environments described in XML format. The proposed system provides the composer with a platform where virtual objects such as sound sources, reflective surfaces, propagating media and others can be used artistically to create time-varying virtual scenes. The Extensible Markup Language (XML) is used to describe and save the content and temporal behaviour of virtual sound scenes or musical compositions. The XML-encoded scenes are then parsed by a Java application, which in turn sends real-time commands to a signal processing layer implemented in MAX/MSP. Fourth-order Ambisonics on a 16-speaker dome is used for spatialisation.
... Full recomputation of the IR is not feasible in real time; still, some initial part of the room response must be reconstructed on the fly to accommodate changes in the positions of the early reflections. Similarly to numerous existing solutions [55], [56], we break the impulse response into the first few spatialized echoes and a decorrelated reverberant tail. The direct path arrival and the first few reflection components of the IR are recomputed in real time, and the rest of the filter is computed once for a given room geometry and boundary. ...
Article
High-quality virtual audio scene rendering is required for emerging virtual and augmented reality applications, perceptual user interfaces, and sonification of data. We describe algorithms for creation of virtual auditory spaces by rendering cues that arise from anatomical scattering, environmental scattering, and dynamical effects. We use a novel way of personalizing the head related transfer functions (HRTFs) from a database, based on anatomical measurements. Details of algorithms for HRTF interpolation, room impulse response creation, HRTF selection from a database, and audio scene presentation are presented. Our system runs in real time on an office PC without specialized DSP hardware.
... A discussion of artificial reverberation models is beyond the scope of this review. Further details can be found in Ahnert and Feistel (1993); Dattorro (1997); Funkhouser et al. (2004); Jot (1992, 1997); Moorer (1978); and Schroeder (1962). ...
Article
To be immersed in a virtual environment, the user must be presented with plausible sensory input including auditory cues. A virtual (three-dimensional) audio display aims to allow the user to perceive the position of a sound source at an arbitrary position in three-dimensional space despite the fact that the generated sound may be emanating from a fixed number of loudspeakers at fixed positions in space or a pair of headphones. The foundation of virtual audio rests on the development of technology to present auditory signals to the listener's ears so that these signals are perceptually equivalent to those the listener would receive in the environment being simulated. This paper reviews the human perceptual and technical literature relevant to the modeling and generation of accurate audio displays for virtual environments. Approaches to acoustical environment simulation are summarized and the advantages and disadvantages of the various approaches are presented.
Chapter
Modern digital audio effects processing and circuit technology have made available a number of methods for processing the acoustic signal, covering various requirements. Among these methods, the term effect generally refers to the processing of an existing sound in order to make it more suggestive.
Thesis
In this dissertation, the discussion is centered around the sound energy decay in enclosed spaces. The work starts with the methods to predict the reverberation parameters, followed by the room impulse response measurement procedures, and ends with an analysis of techniques to digitally reproduce the sound decay. The research on reverberation in physical spaces was initiated when the first formula to calculate a room's reverberation time emerged. Since then, finding an accurate and reliable method to predict reverberation has been an important area of acoustic research. This thesis presents a comprehensive comparison of the most commonly used reverberation time formulas, describes their applicability in various scenarios, and discusses their accuracy when compared to results of measurements. The common sources of uncertainty in reverberation time calculations, such as bias introduced by air absorption and error in the sound absorption coefficient, are analyzed as well. The thesis shows that decreasing such uncertainties leads to good prediction accuracy of the Sabine and Eyring equations in diverse conditions regarding sound absorption distribution. The measurement of the sound energy decay plays a crucial part in understanding the propagation of sound in physical spaces. Nowadays, numerous techniques to capture room impulse responses are available, each having its advantages and drawbacks. In this dissertation, the majority of commonly used measurement techniques are listed, whereas the exponential swept-sine is described in more detail. This work elaborates on the external factors that may impair the measurements and introduce error to their results, such as stationary and non-stationary noise, as well as time variance. The dissertation introduces the Rule of Two, a method of detecting non-stationary disturbances in sweep measurements. It also shows the importance of using the median as a robust estimator in non-stationary noise detection.
Artificial reverberation is a popular sound effect, used to synthesize sound energy decay for the purpose of audio production. This dissertation offers an insight into artificial reverberation algorithms based on recursive structures. The filter design proposed in this work offers precise control over the decay rate while being efficient enough for real-time implementation. The thesis discusses the role of the delay lines and feedback matrix in achieving high echo density in feedback delay networks. It also shows that four velvet-noise sequences are sufficient to obtain smooth output in an interleaved velvet-noise reverberator. The thesis shows that the accuracy of reproduction increases the perceptual similarity between measured and synthesised impulse responses. The findings collected in this dissertation shed light on the intricacies of reverberation prediction, measurement and synthesis. The results allow for reliable estimation of parameters related to sound energy decay, and offer an improvement in the field of artificial reverberation.
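The Sabine and Eyring equations compared in this thesis can be illustrated with a minimal sketch. The metric constant 0.161 s/m is the standard textbook value; the function names, the surface-list layout and the example room are assumptions made for illustration, and air absorption (which the abstract names as a source of bias) is deliberately left out.

```python
import math

def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time in seconds.

    surfaces: list of (area_m2, absorption_coefficient) pairs (assumed layout).
    """
    absorption_area = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption_area

def eyring_rt60(volume_m3, surfaces):
    """Eyring reverberation time in seconds, using the area-weighted mean absorption."""
    total_area = sum(area for area, _ in surfaces)
    mean_alpha = sum(area * alpha for area, alpha in surfaces) / total_area
    return 0.161 * volume_m3 / (-total_area * math.log(1.0 - mean_alpha))

# Hypothetical 5 m x 4 m x 3 m room with a uniform absorption coefficient of 0.2.
room_volume = 5.0 * 4.0 * 3.0
room_surfaces = [(2 * (5 * 4 + 5 * 3 + 4 * 3), 0.2)]
rt_sabine = sabine_rt60(room_volume, room_surfaces)
rt_eyring = eyring_rt60(room_volume, room_surfaces)
```

Because -ln(1 - a) is larger than a for any mean absorption a between 0 and 1, the Eyring estimate is always shorter than the Sabine estimate for the same room, and the two converge as absorption becomes small.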
Article
Conventional channel-based room equalisation can reduce overall colouration caused by the room response; however, it cannot separately correct the colouration caused by the late and early parts of the response, or consider the reverberance in the source signal. A room compensation method is developed here for a source signal in which the dry source sound and the associated target reverberant response are encoded separately, which is possible in an object-based audio framework. The target response is modified using the reproduction room response. Subject to some conditions, the combined response approximates the target, with accurate early and late equalisations, reverberant balance, and decay timing. Stochastic assumptions are used to simplify the processing, enabling efficient real-time processing of the encoded audio.
Book
The rise of microcomputing since the early 1980s has contributed to the widespread adoption of computer tools in music creation and sound production. Previously reserved for research centers, notably in connection with contemporary creation, the computer has gradually moved into amateur and professional studios, offering new operating modes that have tended to replace existing production functions: control and programming of synthesizers through the MIDI protocol, digital audio sampling, non-destructive editing and assembly of digitized sounds on hard disk, real-time processing, formalization and manipulation of musical structures, sound synthesis, acoustic simulation, etc. These new applications developed alongside the democratization of graphical interfaces, which they exploited to offer new forms of representation and manipulation of musical and sound content. The concepts underlying the interfaces of the various existing software packages are stable and convergent enough for them to be analyzed as a whole, on the basis of common criteria. The present study aims to establish such a synthesis by defining a typology of the graphical interfaces found in the various music and sound production applications, whether commercial products or more experimental developments arising from computer music research.
Thesis
Accurate distance cues are important to the degree of realism provided by virtual audio systems. In the last decade there has been an increased interest in this research area. The main focus of this research project is to investigate the effect of different acoustic cues related to distance perception, such as the direct-to-reverberant ratio (D/R), on the perception of the relative distance between sound sources in a virtual medium-sized critical listening room. The virtual sources were generated by convolving a dry speech signal with modelled and measured BRIRs. The BRIRs were modelled using a direction-related image source model for the early reflections and exponentially decaying noise for the reverb tail. In order to investigate relative distance perception and the factors that affect it, a pairwise comparison was conducted involving twenty-three subjects. Three different distances ranging between 1.0 m and 3.0 m were used in the comparison pairs. The main outcomes from the tests are: 1) Modelled and measured BRIRs provide relative distance cues equally well; 2) Direct-to-reverberant ratio is a significant relative distance cue, even when level between virtual sources is normalized; 3) Adding level differences between the sources does not have a significant effect on the perception of relative distance. However, it reduced the precedence of wrong relative distance judgments by 5%-15%; 4) Manipulation of early reflection time of arrival (TOA) does not appear to be a significant cue in distance perception. These findings are important in the field of virtual reality and computer gaming because they show that the relative distance of a virtual source can be manipulated simply by adjusting the direct-to-reverberant ratio of the BRIRs. It can thus be concluded that large BRIR databases and interpolation between BRIRs at different distances are not required for appropriate distance cues.
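The idea of adjusting the direct-to-reverberant ratio of an impulse response can be sketched as follows. This is not the procedure used in the thesis: the 2.5 ms window around the strongest peak that is treated as the direct sound, the function names and the toy impulse response are all assumptions made for illustration.

```python
import math

def direct_window(ir, sample_rate, window_ms=2.5):
    """Return (lo, hi) sample bounds of the assumed direct-sound window around the peak."""
    peak = max(range(len(ir)), key=lambda n: abs(ir[n]))
    half = int(round(window_ms * 1e-3 * sample_rate))
    return max(0, peak - half), min(len(ir), peak + half + 1)

def direct_to_reverberant_db(ir, sample_rate, window_ms=2.5):
    """Energy ratio (dB) between the direct window and everything outside it."""
    lo, hi = direct_window(ir, sample_rate, window_ms)
    direct = sum(s * s for s in ir[lo:hi])
    reverberant = sum(s * s for s in ir[:lo]) + sum(s * s for s in ir[hi:])
    if reverberant == 0.0:
        return math.inf
    return 10.0 * math.log10(direct / reverberant)

def scale_reverberant_part(ir, sample_rate, gain, window_ms=2.5):
    """Scale everything outside the direct window by `gain` (linear amplitude)."""
    lo, hi = direct_window(ir, sample_rate, window_ms)
    return [s if lo <= n < hi else gain * s for n, s in enumerate(ir)]

# Toy impulse response: one direct spike and two weak late reflections.
toy_ir = [0.0] * 100
toy_ir[10], toy_ir[50], toy_ir[60] = 1.0, 0.1, 0.1
dr_before = direct_to_reverberant_db(toy_ir, 8000)
dr_after = direct_to_reverberant_db(scale_reverberant_part(toy_ir, 8000, 0.5), 8000)
```

Halving the amplitude of the reverberant part raises D/R by about 6 dB. The sketch assumes the gain is small enough that the direct sound remains the strongest peak.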
Article
In this letter, we propose a new auditory distance rendering (ADR) algorithm based on the interchannel phase difference (ICPD) control. In the conventional ICPD control, distance perception of the sound image is nonlinearly controlled, and directional localization of the sound image can be biased by changes of the interaural cues. These problems are caused by applying the frequency-independent ICPD without considering the acoustic transfer paths of the system setup. To solve these problems, first, the interaural cues of ear signals are analyzed by binaural auditory simulations. Then, stereophonic ADR filters are designed that produce ear signals with a linearly controlled interaural cross-correlation (IACC) and a consistent interaural level difference (IALD) for sophisticated distance perception under the given stereo setup. Subjective test results show that the proposed algorithm can provide better distance controllability than the conventional method with reduced lateralization blur of the sound image.
Article
3-D sound can be used to synthesize audio stimuli able to describe spatial information. This can be used as a sensory substitute for sight, helping the visually impaired with autonomous orientation and mobility. However, commonly used techniques are computationally demanding and therefore not well suited to implementation in embedded systems. Moreover, sound localization is specific to each individual and complex to measure or customize. We chose to develop a bottom-up physical model to synthesize a simplified transfer function and play back audio signals over headphones. The model permits the computational requirements to be reduced at the cost of lower accuracy of representation. Still, the proposed system can meet the goal of describing spatial information to the listener. Moreover, it can be a promising solution for on-the-go individualization. In this paper we describe the algorithm and its implementation on an embedded platform, and present a comparison between HRIR-based synthesis and the proposed simplified physical approach.
Conference Paper
This paper presents algorithms of two digital sound effects based on the NEDCF (Non-Exponentially Decaying Comb Filter). The first part of the paper deals with the description of a NEDCF structure and with the algorithm of an Echo-type digital sound effect with easily controllable parameters. The second part describes an extension of the previous algorithm for obtaining a new algorithm for multichannel digital reverberation. The reverberation algorithm presented in this paper produces an impulse response with a controllable decay curve, reverberation time and frequency-dependent reverberation time. The decay curve can consist of an arbitrary number of increasing or decreasing linear segments, which enables the creation of an interesting reverberation effect.
Article
A digital computer was used to generate four channels of information, which were recorded on a tape recorder. The computer program provides control over the apparent location and movement of a synthesized sound in an illusory acoustical space. The method controls the distribution and amplitude of direct and reverberant signals between the loudspeakers to provide the angular and distance information and introduces a Doppler shift to enhance velocity information.
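The distance and velocity cues named in this abstract can be sketched in a few lines. The Doppler relation for a moving source and a stationary listener is standard physics; the specific gain laws (direct signal falling as 1/d, reverberant signal as 1/sqrt(d)), the speed of sound and all names are illustrative assumptions that are not stated in the abstract.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C

def doppler_ratio(radial_velocity):
    """Observed/emitted frequency ratio for a stationary listener.

    radial_velocity > 0 means the source is receding (m/s).
    """
    return SPEED_OF_SOUND / (SPEED_OF_SOUND + radial_velocity)

def distance_gains(distance, reference=1.0):
    """Assumed gain laws: direct ~ 1/d, reverberant ~ 1/sqrt(d), clamped at `reference`."""
    d = max(distance, reference)
    direct_gain = reference / d
    reverb_gain = math.sqrt(reference / d)
    return direct_gain, reverb_gain

# A source 4 m away approaching at 10 m/s: higher pitch, and reverberation
# that decreases more slowly with distance than the direct sound.
pitch_ratio = doppler_ratio(-10.0)
direct_gain, reverb_gain = distance_gains(4.0)
```

With these assumed laws the direct-to-reverberant ratio falls as the source moves away, which is what lets the ratio act as a distance cue independent of overall loudness.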
Article
The feedback delay network (FDN) has been proposed for digital reverberation. The digital waveguide network (DWN) is also proposed with similar advantages. This paper notes that the commonly used FDN with an N×N orthogonal feedback matrix is isomorphic to a normalized digital waveguide network consisting of one scattering junction joining N reflectively terminated branches. Generalizations of FDNs and DWNs are discussed. The general case of a lossless FDN feedback matrix is shown to be any matrix having unit-modulus eigenvalues and linearly independent eigenvectors. A special class of FDNs using circulant matrices is proposed. These structures can be efficiently implemented and allow control of the time and frequency behavior. Applications of circulant feedback delay networks in audio signal processing are discussed.
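The losslessness condition stated above is easy to check for a circulant matrix, because its eigenvalues are the discrete Fourier transform of its first row and its eigenvectors (the DFT basis) are always linearly independent. The following is a minimal sketch under that standard fact; the function names and the example row, taken from the Householder-type matrix I - (2/N)·ones, are illustrative choices and not the paper's own design.

```python
import cmath

def circulant_eigenvalues(first_row):
    """Eigenvalues of the circulant matrix C[i][j] = first_row[(j - i) % n] (DFT of the row)."""
    n = len(first_row)
    return [
        sum(first_row[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

def is_lossless_circulant(first_row, tol=1e-9):
    """True when every eigenvalue has unit modulus."""
    return all(abs(abs(lam) - 1.0) < tol for lam in circulant_eigenvalues(first_row))

def circulant_matvec(first_row, x):
    """Multiply the circulant matrix by the vector of delay-line outputs x."""
    n = len(first_row)
    return [sum(first_row[(j - i) % n] * x[j] for j in range(n)) for i in range(n)]

# First row of I - (2/N) * ones for N = 4: circulant and orthogonal.
row = [0.5, -0.5, -0.5, -0.5]
state = [1.0, 2.0, 3.0, 4.0]
fed_back = circulant_matvec(row, state)
energy_in = sum(v * v for v in state)
energy_out = sum(v * v for v in fed_back)
```

For this row the feedback step preserves signal energy, so any decay in a reverberator built on it would have to come from separately inserted attenuation. The matrix-vector product is written naively here; an efficient implementation would exploit the circulant structure.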
Article
Transaural stereo, generic for binaural stereo processed for cancellation of loudspeaker-to-ear crosstalk, results from the use of minimum-phase filters in shuffler configuration. Simplifying the filters further at short wavelengths makes the listener position noncritical. Full spatial qualities appear in a conventional stereo playback that avoids early reflections. Inverse shufflers provide precise transaural pan functions for multitrack work.
Article
The investigation on the application of ambisonics to the composition and performance of electroacoustic music is reported. Csound implementations of ambisonic encoding and decoding techniques that can be used in any computing platform supporting four or more independent audio output channels are also presented. Moreover, a variety of spatialization techniques is examined within a historical and technical context. This provides the basis for a review of ambisonic theory.
Article
A conceptual model for representing the problems of spatial sound processing is introduced, and the implementation of this model is described in the context of the C music sound synthesis program. Two sets of problems are considered simultaneously: the physical characteristics of a space to be simulated, and the psychological characteristics of sounds presented to the listeners over loudspeakers. Details of the model, such as loudspeaker placement, virtual acoustic space, radiation vectors, early echo patterns, and global reverberation, are given. The extension of the model to three (or more) dimensions is mentioned.
Article
A general approach is proposed to the problem of realizing a recursive digital delay network capable of simulating in real time the perceptively relevant characteristics of the reverberation decay in a room. The analysis/synthesis method presented makes it possible to imitate the late reverberation of a given room by optimizing some of the reverberant filter's parameters. The analysis phase is based on a time-frequency representation of the energy decay, computed from an impulse response measured in the room. The energy decay relief is proposed as a spectral development of the integrated energy decay curve introduced by Schroeder. Its three-dimensional representation allows perceptively relevant visual comparison of two room responses (measured or artificial) and accurate calculation of some widely used objective criteria of room acoustic quality.
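Schroeder's integrated energy decay curve, which the energy decay relief extends along the frequency axis, can be computed by backward integration of the squared impulse response. The sketch below covers only the broadband curve; the function names are illustrative, and a full energy decay relief would apply the same integration to each band of a time-frequency representation.

```python
import math

def energy_decay_curve(ir):
    """Backward-integrated energy: edc[n] = sum of ir[k]**2 for all k >= n."""
    edc = [0.0] * len(ir)
    running = 0.0
    for n in range(len(ir) - 1, -1, -1):
        running += ir[n] * ir[n]
        edc[n] = running
    return edc

def energy_decay_curve_db(ir):
    """Decay curve in dB relative to the total energy (assumes a non-silent response)."""
    edc = energy_decay_curve(ir)
    return [10.0 * math.log10(e / edc[0]) if e > 0.0 else -math.inf for e in edc]

# Synthetic exponentially decaying response as a stand-in for a measurement.
decaying_ir = [0.9 ** n for n in range(200)]
curve_db = energy_decay_curve_db(decaying_ir)
```

The curve starts at 0 dB and never increases, which makes it far smoother than the squared impulse response itself and therefore better suited to fitting a decay slope.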
Article
A block FFT implementation of convolution is vastly more efficient than the direct-form FIR filter. Unfortunately, block processing incurs significant input-output delay, which is undesirable for real-time applications. A hybrid convolution method is proposed, which combines direct-form and block FFT processing. The result is a zero-delay convolver that performs significantly better than direct-form methods.
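The scheduling idea behind a hybrid zero-delay convolver can be sketched as follows. This is a simplified illustration rather than the paper's algorithm: it uses a single block size, and the block stage is computed by plain time-domain multiplication as a stand-in for the FFT, so it shows only why no input-output delay arises. All names are hypothetical.

```python
def direct_convolve(x, h):
    """Reference direct-form convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

def hybrid_convolve(x, h, block=4):
    """Head of h handled per sample; tail handled once per completed input block."""
    head, tail = h[:block], h[block:]
    total = len(x) + len(h) - 1
    padded = list(x) + [0.0] * (total - len(x))
    scheduled = [0.0] * total  # tail contributions, always written to future samples
    y = [0.0] * total
    for n in range(total):
        # Direct-form stage: uses the current and past inputs only.
        acc = 0.0
        for k, hk in enumerate(head):
            if n - k >= 0:
                acc += hk * padded[n - k]
        y[n] = acc + scheduled[n]
        # Block stage: runs when an input block has just been completed.
        if tail and (n + 1) % block == 0:
            start = n + 1 - block
            for i in range(block):
                for k, hk in enumerate(tail):
                    idx = start + i + block + k  # always greater than n
                    if idx < total:
                        scheduled[idx] += padded[start + i] * hk
    return y

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
response = [1.0, 0.0, -1.0, 2.0, 3.0, 1.0, 0.0, 2.0, 1.0]
reference = direct_convolve(signal, response)
hybrid = hybrid_convolve(signal, response, block=4)
```

Because every tail tap has a delay of at least one block, the block stage only ever writes to output samples that have not yet been emitted, while the head produces each output sample as soon as its input arrives.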
Article
This paper gives an overview of the principles and methods for synthesizing complex 3D sound scenes by processing multiple individual source signals. Signal-processing techniques for directional sound encoding and rendering over loudspeakers or headphones are reviewed, as well as algorithms and interface models for synthesizing and dynamically controlling room reverberation and distance effects. A real-time modular spatial-sound-processing software system, called Spat, is presented. It allows reproducing and controlling the localization of sound sources in three dimensions and the reverberation of sounds in an existing or virtual space. A particular aim of the Spatialisateur project is to provide direct and computationally efficient control over perceptually relevant parameters describing the interaction of each sound source with the virtual space, irrespective of the chosen reproduction format over loudspeakers or headphones. The advantages of this approach are illustrated in practical contexts, including professional audio, computer music, multimodal immersive simulation systems, and architectural acoustics.
Article
A unitary n-input n-output linear network preserves the total energy of all input signals. Using the functional calculus of normal matrices, it is proved that feedback round a unitary circuit plus a direct path with suitable gain yields another unitary circuit. This has applications to the design of electronic reverberation units.
Circulant and elliptic feedback delay networks for artificial reverberation
  • D Rocchesso
  • J O Smith
Rocchesso, D., and Smith, J. O. 1997. "Circulant and elliptic feedback delay networks for artificial reverberation". IEEE Trans. Speech & Audio 5(1).