Article

Amplitude panning decreases spectral brightness with concert hall auralizations

Abstract

Subjective comparisons of room acoustics require high-fidelity auralizations. Earlier research has shown that room impulse response measurements combined with directional analysis using the Spatial Decomposition Method provide accurate reproductions of concert hall acoustics in multi-channel listening. Moreover, timbral aspects have been found to be even more important for overall audio quality than spatial fidelity. This paper explores the effect of the number of true and virtual loudspeakers, and of the application of amplitude panning, on the brightness of the sound. Results from a listening test with concert hall auralizations suggest that amplitude panning reduces the perceived brightness in both loudspeaker and headphone listening.
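The amplitude panning studied here can be illustrated with a minimal sketch of the classic tangent panning law for a symmetric stereo loudspeaker pair (a simplified stand-in: the paper uses multi-loudspeaker VBAP, and the function name below is an assumption):

```python
import numpy as np

def pair_panning_gains(theta_deg, base_deg=30.0):
    """Tangent-law amplitude panning between a symmetric loudspeaker
    pair at +/-base_deg; returns energy-normalized gains (gL, gR)."""
    # tangent law: (gL - gR) / (gL + gR) = tan(theta) / tan(base)
    t = np.tan(np.radians(theta_deg)) / np.tan(np.radians(base_deg))
    gL, gR = 1.0 + t, 1.0 - t
    norm = np.hypot(gL, gR)       # normalize so gL^2 + gR^2 = 1
    return gL / norm, gR / norm

# a phantom source straight ahead drives both loudspeakers equally
gL, gR = pair_panning_gains(0.0)
```

Because a panned phantom source splits the signal coherently over two or more loudspeakers, the summed wavefronts at the listener can interfere at high frequencies, which is one plausible mechanism (also noted in the excerpts below) for the reduced brightness.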

... Nine acoustical conditions were used in this listening experiment: the unmodified sound fields of Rooms A, B, and C, and a set of six acoustical modifications of Room D. They are labeled from Rooms D-1 to D-6, ranging from the lowest measured RT30 to the highest, respectively. This set provides a range of possible RT of an average living room as described in IEC 60268-13 [11], including the extreme cases of Rooms C and D-6. ...
... The details of the acoustical conditions are summarized in Table I. The measured sound pressure at the listening positions is shown in Fig. 3(a) and the calculated RT30 in Fig. 4. ...
... To reproduce the acoustical conditions in the laboratory, the analyzed spatial RIR were convolved with the three program materials, loudness-matched before convolution at −15 dB LUFS [34]. ... [15,0], [0,90], [15,20], [30,−35], [30,0], [30,45], [50,−10], [50,10], [70,−20], [70,0], [70,25], [90,0], [110,−15], [110,10], [130,0], [130,30], [150,−10], [150,20], [180,−10], [180,10], [180,40], [−15,20], where [0,0] indicates the forward-facing median ... complementary excite the acoustical conditions under investigation, as typically recommended for audio evaluations [35]. These excerpts formed the three levels of program and are further referred to as music, percussion, and speech, respectively. ...
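The convolution and loudness-matching step described in the excerpt can be sketched as follows (a simplified sketch: plain RMS normalization stands in for the LUFS loudness matching of the paper, and all names and signals are illustrative assumptions):

```python
import numpy as np
from scipy.signal import fftconvolve

def auralize(program, rirs, target_rms_db=-15.0):
    """Convolve a dry program signal with per-channel RIRs.
    RMS matching here is a simplified stand-in for the -15 dB LUFS
    loudness matching described in the excerpt."""
    rms = np.sqrt(np.mean(program**2))
    gain = 10.0 ** (target_rms_db / 20.0) / max(rms, 1e-12)
    dry = gain * program
    # one convolution per reproduction channel
    return np.stack([fftconvolve(dry, h) for h in rirs])

fs = 48000
program = np.random.randn(fs)            # 1 s of noise as dummy program
rirs = [np.r_[1.0, np.zeros(99)]] * 3    # three trivial 100-tap RIRs
out = auralize(program, rirs)            # shape: (3, fs + 100 - 1)
```

A proper LUFS implementation would add K-weighting and gating per ITU-R BS.1770; the structure of the processing chain stays the same.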
Article
Full-text available
An experiment was conducted to identify the perceptual effects of acoustical properties of domestic listening environments, in a stereophonic reproduction scenario. Nine sound fields, originating from four rooms, were captured and spatially reproduced over a three-dimensional loudspeaker array. A panel of ten expert assessors identified and quantified the perceived differences of those sound fields using their own perceptual attributes. A multivariate analysis revealed two principal dimensions that could summarize the sound fields of this investigation. Four perceptual constructs seem to characterize the sensory properties of these dimensions, relating to Reverberance, Width & Envelopment, Proximity, and Bass. Overall, the results signify the importance of reverberation in residential listening environments on the perceived sensory experience, and as a consequence, the assessors' preferences towards certain decay times.
... Originally, we applied vector-base amplitude panning (VBAP) [28] with SDM [25], but in this paper we present the nearest loudspeaker synthesis for the SDM samples. This synthesis approach increases the perceived clarity of the spatial sound synthesis, which is a preferred feature according to our study in [29]. ...
... As explained above, the window size is set according to the lowest frequency of each octave band and 50% overlap is used in the synthesis. The spatial sound synthesis is implemented with NLS and using P = 24 loudspeakers, arranged as described for example in [29]. Here, we compare the total magnitude response, i.e., the sum of individual loudspeaker channels' energy ...
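The nearest loudspeaker synthesis (NLS) mentioned in these excerpts can be sketched as follows (a minimal illustration under assumed data layouts; the variable names and the toy loudspeaker grid are not from the paper):

```python
import numpy as np

def nearest_loudspeaker_synthesis(pressure, doa, spk_dirs):
    """Distribute each SDM pressure sample to the loudspeaker whose
    direction is closest to the sample's estimated DOA.
    pressure: (N,) samples; doa: (N, 3) unit vectors;
    spk_dirs: (P, 3) unit vectors toward the P loudspeakers."""
    # largest dot product = smallest angular distance
    idx = np.argmax(doa @ spk_dirs.T, axis=1)
    out = np.zeros((spk_dirs.shape[0], len(pressure)))
    out[idx, np.arange(len(pressure))] = pressure
    return out

spk = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], float)
doa = np.array([[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]])
doa /= np.linalg.norm(doa, axis=1, keepdims=True)
chans = nearest_loudspeaker_synthesis(np.array([1.0, 0.5]), doa, spk)
# sample 0 -> loudspeaker 0, sample 1 -> loudspeaker 1
```

Unlike VBAP, each sample excites exactly one loudspeaker, so no coherent triplet summation occurs, which is consistent with the increased clarity reported above.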
Article
This engineering report describes the application of the Spatial Decomposition Method in a car cabin using a compact microphone array. The proposed method provides objective analyses of the sound field with respect to direction and energy, and enables the synthesis for multichannel loudspeaker reproduction. Due to the acoustic complexity of a car cabin, a number of recommended steps is presented to improve both objective and perceptual performance of the method. The results suggest that the method can be successfully applied to car audio systems.
... The data obtained from the spatial analysis in SDM (direction of arrival and pressure) can be used to render impulse responses for the desired directions by distributing pressure samples. This can be achieved by applying either VBAP [3] or the K-Nearest-Neighbour (KNN) algorithm [5], [10] to assign each sample to the nearest loudspeaker. The SDM implementation in this study uses the KNN algorithm to distribute pressure to the nearest point on the employed grid. ...
Conference Paper
Multiple spatial analysis and synthesis methods have been proposed to date; however, there is still a lack of scientific evidence on how the proposed methods and measurement setups compare to each other. This paper presents a subjective study that investigates how the spatial analysis and synthesis method affects the spatial and timbral fidelities of binaural renderings for three different stimuli and three source positions. The evaluated positions, (a) +30°, (b) +90°, (c) +135°, were motivated by the 4+7+0 loudspeaker layout from the ITU-R BS.2051-2 recommendation. Spatial room impulse responses were measured in the ITU-R BS.1116-compliant listening room using two different microphone arrays. The measurements were used to create auralisations with the Spatial Decomposition Method (SDM) and Higher-Order Spatial Impulse Response Rendering (HO-SIRR). The test conditions were selected to investigate: (i) rendering method, (ii) direction of arrival (DOA) estimation method, (iii) microphone array, (iv) use of a dedicated centre reference microphone with SDM, and (v) band-pass filtering of the spatial room impulse responses prior to the TDOA-based DOA estimation and DOA enforcement for the direct sound. The listening experiment followed a MUSHRA-like methodology with a hidden reference, employing a five-grade similarity scale. The subjects' task was to evaluate how similar the test conditions are to the reference (measured BRIR) with regard to the assessed attribute. This article includes a short review of the existing methods, describes the experimental design, and presents the results of the formal listening tests, followed by a discussion.
... The lack of energy at high frequencies could be due to two reasons: a non-omnidirectional response of the measurement microphone and a non-coherent addition of the loudspeaker triplet signals that generate every single wavefront [10,11]. In addition, due to the small wavelength at high frequencies, a small misalignment between the microphone position and the center of the listening environment could lead to large deviations in the results. ...
Conference Paper
Full-text available
Studying the influence of room acoustics on musicians requires the possibility of analyzing a live performance in different acoustic conditions. A virtual acoustic environment provides modifiable room acoustic conditions without the necessity of moving into different rooms, thus removing non-acoustic cues and external factors that can influence the performance. This paper presents a virtual acoustic environment for the study of live music performance under changing acoustic conditions. Spatial room impulse responses are measured in different rooms using a microphone array and a sound source on stage. Using the Spatial Decomposition Method, associated impulse responses for an arbitrary loudspeaker set-up are generated. The sound of a musician performing in a quasi-anechoic room equipped with a surrounding loudspeaker set-up is captured with close miking and is convolved with the decomposed impulse responses. Finally, the convolved sound is played back reproducing the sound reflections in real time. The environment is intended to serve as a research tool on interaction of room acoustics and music performance but also as an educational tool to assist musicians’ training in different acoustic conditions. A user friendly graphical interface allows a fast selection of the auralized rooms and performance recording.
... Because the CIPIC HRTFs are obtained in anechoic conditions, the HRTFs were preprocessed with an equalisation response to better resemble listening in a normal room. The equalisation response was obtained by 1/3-octave smoothing the average of eight HRTFs on the lateral plane at 45 degree intervals, and raising the smoothed average to the power of α = 0.6 (for more details, see [23]). Several listeners have informally evaluated the equalisation to produce a plausible binaural reproduction using CIPIC responses. ...
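The equalisation described in this excerpt, 1/3-octave smoothing of an averaged HRTF magnitude followed by raising to the power α = 0.6, can be sketched as follows (a simplified magnitude-domain sketch; function names and the frequency grid are assumptions):

```python
import numpy as np

def third_octave_smooth(mag, freqs):
    """Smooth a magnitude response with a 1/3-octave sliding window."""
    out = np.empty_like(mag)
    for i, f in enumerate(freqs):
        lo, hi = f * 2 ** (-1 / 6), f * 2 ** (1 / 6)   # 1/3-octave band
        band = (freqs >= lo) & (freqs <= hi)
        out[i] = mag[band].mean()
    return out

def diffuse_eq(hrtf_mags, freqs, alpha=0.6):
    """Average several lateral-plane HRTF magnitudes, smooth at 1/3
    octave, and raise to the power alpha (0.6 as in the excerpt)."""
    avg = np.mean(hrtf_mags, axis=0)
    return third_octave_smooth(avg, freqs) ** alpha

freqs = np.logspace(np.log10(20), np.log10(20000), 512)
# a flat set of HRTF magnitudes passes through unchanged
eq = diffuse_eq(np.ones((8, freqs.size)), freqs)
```

The exponent below 1 compresses the spectral features of the averaged response, giving a gentler room-like coloration rather than a full diffuse-field inversion.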
Conference Paper
Full-text available
Timing in a music ensemble performance is asynchronous by nature. Asynchrony is generated by the players themselves, and further delays to listeners are introduced by the location and orientation of the instruments on stage. While the musicians aim at accurate mutual synchronization, deviating from perfect synchrony may even produce desirable effects. For one, the timbre can appear broader, as with orchestral string sections. The perceived asynchrony within an ensemble varies between 20 and 50 ms. This paper studies the perceptual relevance of asynchrony between three orchestral instrument groups in two concert halls. Perfect synchrony was compared to 1) the bass-register instruments (double basses and timpani) played first, with delays of 20 ms for middle-register instruments (cellos, bassoon) and 40 ms for treble-register instruments (winds, brass, violas, violins), and 2) the treble-register instruments played first, with delays of 20 ms for middle-register and 40 ms for bass-register instruments. Listener preference was investigated with a paired comparison online listening test using binaural renderings of the concert halls over headphones. The results were analysed with a probabilistic choice model with latent preference groups. The analysis shows that listener preference generally depends on the asynchrony: the bass-register instruments starting first is the most preferred option in both halls, while the treble-register starting first is the least preferred. The results also imply that preference for timing depends on the concert hall, which calls for future listening tests with a spatial audio system in order to reproduce the spatial characteristics of the concert halls more accurately.
... (see Pätynen, Tervo, & Lokki (2014) for details). The direction of arrival as a function of time was estimated from the spatial room impulse responses with the recently developed Spatial Decomposition Method. ...
Article
Full-text available
It is known that the perception of a warm and full sound requires rich bass in concert halls. However, earlier research offers an incomplete understanding of the perception of bass in concert halls, as the results have been inconsistent; in particular, it remains unclear how the excess attenuation of low frequencies due to seats affects the perception of bass, both its level and its quality. This paper studies the level of perceived bass and its clarity in four concert halls via paired comparison listening tests. The results suggest that the perceived level of bass is strongly related to the seat-dip effect, in particular to its main attenuation frequency. Moreover, the perceived clarity of bass seems to depend on the musical content and instruments. The results also indicate a complicated relationship between the level and clarity of bass, as clarity can be enhanced by a high level of bass in some cases.
Article
Full-text available
Objective: Speech-in-noise tests are widely used in hearing diagnostics but typically without reverberation, although reverberation is an inextricable part of everyday listening conditions. To support the development of more real-life-like test paradigms, the objective of this study was to explore how spatially reproduced reverberation affects speech recognition thresholds in normal-hearing and hearing-impaired listeners. Design: Thresholds were measured with a Finnish speech-in-noise test without reverberation and with two test conditions with reverberation times of ∼0.9 and 1.8 s. Reverberant conditions were produced with a multichannel auralisation technique not used before in this context. Study sample: Thirty-four normal-hearing and 14 hearing-impaired listeners participated in this study. Five people were tested with and without hearing aids. Results: No significant differences between test conditions were found for the normal-hearing listeners. Results for the hearing-impaired listeners indicated better performance for the 0.9 s reverberation time compared to the reference and the 1.8 s conditions. Benefit from hearing aid use varied between individuals; for one person, an advantage was observed only with reverberation. Conclusions: Auralisations may offer information on speech recognition performance that is not obtained with a test without reverberation. However, more complex stimuli and/or higher signal-to-noise ratios should be used in the future.
Chapter
This chapter discusses the acoustics of concert halls from the viewpoint of binaural perception. It explains how early reflections have a crucial role in the quality of sound, perceived dynamics, and timbre. In particular, the directions from which these reflections reach the listener are important for human spatial hearing. The chapter has strong links to psychoacoustical phenomena, such as the precedence effect, binaural loudness, and spaciousness. The chapter discusses which aspects of a concert hall give listeners the impression of intimacy and the perception of proximity to the sound. Moreover, it is explained how a concert hall can change the perceived dynamics of a music ensemble. Examples are presented using measured data from real concert halls.
Thesis
Full-text available
The central topic of this thesis is Reverberation. Reverberation is used as a global term to describe a series of physical and perceptual phenomena that occur in enclosed environments and relate to the acoustical interaction between a sound source and the enclosure. This work focuses on the effects of reverberation that are likely to occur within common listening environments, such as car cabins and ordinary residential listening rooms. In the first study, a number of acoustical fields were captured in a physically modified car cabin and evaluated by expert listeners in a laboratory, using a spatial reproduction system. In the second study, nine acoustical conditions from four ordinary listening rooms were perceptually evaluated by experienced listeners. The results indicated the importance of decay times in these types of enclosures, even though such decay times are theoretically short and non-dominant. It was shown that a number of perceived attributes were evoked by the alterations of the fields, both within the same enclosure and between different ones. The studies made use of a novel assessment framework, which forms a significant part of this work. The proposed framework overcomes previously identified challenges in the perceptual evaluation of room acoustics, relating to the acquisition and presentation of the acoustical fields, as well as the perceptual evaluation of such complex sound stimuli. It was shown that this framework was able to decompose the phenomena that underlie the perceived sensations across assessors. The related multivariate analysis techniques employed the conjoint interpretation of both the physical and perceptual properties of the fields in a factorial space and effectively enabled the direct investigation of their relationships.
Overall the work described in this thesis contributes to: (1) understanding the perceptual effects imposed in the reproduced sound within automotive and residential enclosures, and (2) the design and implementation of a perceptual assessment protocol for evaluating room acoustics. The thesis contains two parts. In the first, the background and rationale of the research project are presented. The second part includes four articles that describe in detail the research undertaken.
Article
Some studies of concert hall acoustics consider the acoustics in a hall as a single entity. Here, it is shown that the acoustics vary between different seats, and the choice of music also influences the perceived acoustics. The presented study compared the acoustics of six unoccupied concert halls with extensive listening tests, applying two different music excerpts at three different seats. Twenty-eight assessors rated the halls according to their subjective preference and individual attributes with a paired comparison method. Results show that assessors can be classified into two preference groups, which prioritize different perceptual factors. In addition, the individual attributes elicited by assessors were clustered into three latent classes.
Article
An audience's auditory experience during a thrilling and emotive live symphony concert is an intertwined combination of the music and the acoustic response of the concert hall. Music in itself is known to elicit emotional pleasure, and at best, listening to music may evoke concrete psychophysiological responses. Certain concert halls have gained a reputation for superior acoustics, but despite the continuous research by a multitude of objective and subjective studies on room acoustics, the fundamental reason for the appreciation of some concert halls remains elusive. This study demonstrates that room acoustic effects contribute to the overall emotional experience of a musical performance. In two listening tests, the subjects listen to identical orchestra performances rendered in the acoustics of several concert halls. The emotional excitation during listening is measured in the first experiment, and in the second test, the subjects assess the experienced subjective impact by paired comparisons. The results showed that the sound of some traditional rectangular halls provides greater psychophysiological responses and subjective impact. These findings provide a quintessential explanation for these halls' success and reveal the overall significance of room acoustics for emotional experience in music performance.
Article
Full-text available
Spatial impulse response rendering (SIRR) is a recent technique for the reproduction of room acoustics with a multichannel loudspeaker system. SIRR analyzes the time-dependent direction of arrival and diffuseness of measured room responses within frequency bands. Based on the analysis data, a multichannel response suitable for reproduction with any chosen surround loudspeaker setup is synthesized. When loaded to a convolving reverberator, the synthesized responses create a very natural perception of space corresponding to the measured room. A technical description of the analysis-synthesis method is provided. Results of formal subjective evaluation and further analysis of SIRR are presented in a companion paper to be published in JAES in 2006 Jan./Feb.
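The analysis stage described here, estimating time-dependent direction of arrival and diffuseness, can be sketched with a simplified broadband intensity-vector computation (SIRR itself operates per frequency band; the B-format-style signal convention and all names below are assumptions):

```python
import numpy as np

def intensity_analysis(w, x, y, z):
    """Per-sample DOA and diffuseness estimates from B-format-like
    signals: omni pressure w and three figure-of-eight components
    x, y, z. Simplified broadband version; physical constants such
    as the characteristic impedance are folded out."""
    v = np.stack([x, y, z], axis=1)
    I = -w[:, None] * v                        # active intensity vector
    E = 0.5 * (w**2 + (v**2).sum(axis=1))      # energy density (normalized)
    norm = np.linalg.norm(I, axis=1)
    doa = -I / np.maximum(norm, 1e-12)[:, None]     # unit vector toward source
    diffuseness = 1.0 - norm / np.maximum(E, 1e-12) # 0 = plane wave, 1 = diffuse
    return doa, diffuseness

# a plane wave from the front: fully directional, zero diffuseness
w = np.ones(4)
doa, psi = intensity_analysis(w, w, np.zeros(4), np.zeros(4))
```

In the synthesis stage, the directional part can then be panned toward the estimated DOA while the diffuse remainder is decorrelated over all loudspeakers.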
Article
Full-text available
This paper presents a spatial encoding method for room impulse responses. The method is based on decomposing the spatial room impulse responses into a set of image-sources. The resulting image-sources can be used for room acoustics analysis and for multichannel convolution reverberation engines. The analysis method is applicable for any compact microphone array and the reproduction can be realized with any of the current spatial reproduction methods. Listening test experiments with simulated impulse responses show that the proposed method produces an auralization indistinguishable from the reference in the best case.
Article
Full-text available
A method for recording symphonic music with acoustical instruments in an anechoic chamber is presented. Excerpts of approximately 3 minutes were recorded from orchestral works representing different musical styles. The parts were recorded separately one at a time in order to obtain perfect separation between instruments. The challenge was to synchronize different takes and parts so that they could later be combined to an ensemble. The common timing was established by using a video of a conductor conducting a pianist playing the score. The musicians then played in an anechoic chamber by following the conductor video and by listening to the piano with headphones. The recordings of each instrument were done with 22 microphones positioned evenly around the player. The recordings, which are made freely available for academic use, can be used in research on acoustical properties of instruments, and for studies on concert hall acoustics. This article covers the design, installation, and technical specifications of the recording system. In addition, the post-processing, subjective comments of musicians as well as potential applications are discussed.
Article
Full-text available
Subjective evaluation of acoustics was studied by recording nine concert halls with a simulated symphony orchestra on a seat 12 m from the orchestra. The recorded music was spatially reproduced for subjective listening tests and individual vocabulary profiling. In addition, the preferences of the assessors and objective parameters were gathered. The results show that concert halls were discriminated using perceptual characteristics, such as Envelopment/Loudness, Reverberance, Bassiness, Proximity, Definition, and Clarity. With these perceptual dimensions the preference ratings can be explained. Seventeen assessors were divided into two groups based on their preferences. The first group preferred concert halls with relatively intimate sound, in which it is quite easy to hear individual instruments and melody lines. In contrast, the second group preferred a louder and more reverberant sound with good envelopment and strong bass. Even though all halls were recorded exactly at the same distance, the preference is best explained with subjective Proximity and with Bassiness, Envelopment, and Loudness to some extent. Neither the preferences nor the subjective ratings could be fully explained by objective parameters (ISO 3382-1:2009), although some correlations were found.
Article
Full-text available
This article reviews the fundamental ideas of the binaural recording technique. A model is given that describes the sound transmission from a source in a free field, through the external ear, to the eardrum. It is shown that sound pressures recorded at any point in the ear canals (possibly even a few millimeters outside, and even with a blocked ear canal) can be used for binaural recordings, since they include the full spatial information given to the ear. The sound transmission from a headphone is also described. It is shown how the correct total transmission in a binaural system can be guaranteed by means of an electronic equalizing filter between the recording head and the headphone. The advantage of an open headphone is stated. It is shown that a certain degree of loudspeaker compatibility can be achieved if the equalizer is divided into a recording side and a playback side. A method for true reproduction of binaural signals through loudspeakers is also described. A number of topical and prospected applications of binaural technology are mentioned. Some of these utilize computer synthesis of binaural signals, a technique which is also described.
Article
How do acoustics affect a concertgoer’s experience? With the right tools, we can learn a lot by asking listeners to tell us in their own words.
Article
Significance The concert hall conveys orchestral sound to the listener through acoustic reflections from directions defined by the room geometry. When sound arrives from the sides of the head, binaural hearing emphasizes the same frequencies produced by higher orchestral-playing dynamics, thus enhancing perceived dynamic range. Many studies on room acoustics acknowledge the importance of such lateral reflections, but their contribution to the dynamic responsiveness of the hall has not yet been understood. Because dynamic expression is such a critical part of symphonic music, this phenomenon helps to explain the established success of shoebox-type concert halls.
Article
Auralizations are commonly used today by architectural acousticians as a tool to model acoustically sensitive spaces. This paper presents investigations employing an auralization methodology known as multi-channel auralizations, to determine the benefits of using an increasing number of channels in such auralizations. First an objective evaluation was conducted to examine how acoustic parameters, such as reverberation time, vary when using “quadrant” (one fourth of a spherical source) or “thirteenth” sources to create the binaural room impulse responses. Large differences in the values were found between the different sections of the sphere, on the order of several just noticeable differences. Two subjective studies were then pursued, first to determine if auralizations made with an increasing number of channels sound more realistic and have an increased perceived source size, using solo musical instruments of varying directivity indices as the sources. Overall, subjects perceived the auralizations made with an increasing number of channels as more realistic, whereas results for perceived source size are less clear. The second subjective study assessed the ease with which subjects could identify the source orientation from the auralizations as a function of number of channels. Results indicate that more channels made it easier for subjects to differentiate between source orientations.
Article
A generalization of the approach of Brockhoff and Skovgaard [Food Quality & Preference, 5 (1994), 215] is given. The emphasis is on univariate assessor performance in sensory profiling. Statistical significance tests for differences between assessors in scaling, variability, and sensitivity are given. A test for a disagreement effect is also presented. In addition, the approach provides individual scaling, variability, disagreement, and sensitivity values that can be used for subsequent tabulation, plotting, and statistical analysis. The method of maximum likelihood is used throughout, and all computations are implemented in a SAS® Macro PANMODEL that is available via the author's homepage: http://www.dina.kvl.dk/∼per.
Conference Paper
Directional audio coding (DirAC) is a recently proposed method for spatial sound reproduction. So far it has been used only in loudspeaker reproduction, and a method for headphone reproduction is presented in this article. In principle, the method uses virtual loudspeakers simulated with head related transfer functions (HRTFs), and head tracking. The method was evaluated subjectively, and the results are presented. The results show that a plausible spatial impression can be reproduced using the binaural realization of DirAC.
Article
Tversky (1972) has proposed a family of models for paired-comparison data that generalize the Bradley-Terry-Luce (BTL) model and can, therefore, apply to a diversity of situations in which the BTL model is doomed to fail. In this article, we present a Matlab function that makes it easy to specify any of these general models (EBA, Pretree, or BTL) and to estimate their parameters. The program eliminates the time-consuming task of constructing the likelihood function by hand for every single model. The usage of the program is illustrated by several examples. Features of the algorithm are outlined. The purpose of this article is to facilitate the use of probabilistic choice models in the analysis of data resulting from paired comparisons.
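The BTL special case of these models can be sketched with a minimal maximum-likelihood fit (a sketch only, not the cited Matlab implementation; the example win matrix is invented):

```python
import numpy as np
from scipy.optimize import minimize

def fit_btl(wins):
    """Maximum-likelihood Bradley-Terry-Luce fit. wins[i, j] is the
    number of times item i was chosen over item j in paired
    comparisons. Returns ratio-scale worth values summing to 1."""
    n = wins.shape[0]

    def nll(log_v):
        v = np.exp(np.r_[0.0, log_v])   # fix v[0] = 1 for identifiability
        ll = 0.0
        for i in range(n):
            for j in range(n):
                if i != j and wins[i, j] > 0:
                    # P(i chosen over j) = v_i / (v_i + v_j)
                    ll += wins[i, j] * np.log(v[i] / (v[i] + v[j]))
        return -ll

    res = minimize(nll, np.zeros(n - 1), method="BFGS")
    v = np.exp(np.r_[0.0, res.x])
    return v / v.sum()

wins = np.array([[0, 8, 9],
                 [2, 0, 7],
                 [1, 3, 0]])
worth = fit_btl(wins)   # item 0 wins most often, so it gets the largest worth
```

The EBA and Pretree generalizations replace the single worth parameter per item with sums over aspect parameters, which is what removes the BTL model's strong independence assumption.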
Article
Mean opinion score ratings of reproduced sound quality typically pool all contributing perceptual factors into a single rating of basic audio quality. In order to improve understanding of the trade-offs between selected sound quality degradations that might arise in systems for the delivery of high quality multichannel audio, it was necessary to evaluate the influence of timbral and spatial fidelity changes on basic audio quality grades. The relationship between listener ratings of degraded multichannel audio quality on one timbral and two spatial fidelity scales was exploited to predict basic audio quality ratings of the same material using a regression model. It was found that timbral fidelity ratings dominated but that spatial fidelity predicted a substantial proportion of the basic audio quality.
Article
A study was conducted with the goal of quantifying auditory attributes that underlie listener preference for multichannel reproduced sound. Short musical excerpts were presented in mono, stereo, and several multichannel formats to a panel of 40 selected listeners. Scaling of auditory attributes, as well as overall preference, was based on consistency tests of binary paired-comparison judgments and on modeling the choice frequencies using probabilistic choice models. As a result, the preferences of nonexpert listeners could be measured reliably at a ratio scale level. Principal components derived from the quantified attributes predict overall preference well. The findings allow for some generalizations within musical program genres regarding the perception of and preference for certain spatial reproduction modes, but for limited generalizations across selections from different musical genres.
Conference Paper
This paper describes a public-domain database of high-spatial-resolution head-related transfer functions measured at the UC Davis CIPIC Interface Laboratory and the methods used to collect the data. Release 1.0 (see http://interface.cipic.ucdavis.edu) includes head-related impulse responses for 45 subjects at 25 different azimuths and 50 different elevations (1250 directions) at approximately 5° angular increments. In addition, the database contains anthropometric measurements for each subject. Statistics of anthropometric parameters and correlations between anthropometry and some temporal and spectral features of the HRTFs are reported.
Spatio-temporal energy measurements in renowned concert halls with a loudspeaker orchestra
S. Tervo, J. Pätynen, and T. Lokki. In Proc. Meetings on Acoustics, volume 19, pages 15-19. Acoustical Society of America, 2013.