Article

A just noticeable difference in C50 for speech

Abstract

C-50 is an early-to-late arriving sound ratio used to assess the influence of room acoustics on the clarity and intelligibility of speech. A just noticeable difference in C-50 values was determined for speech sounds in simulated sound fields. Over a range of C-50 values from -3 to +9 dB, representing most situations in rooms for speech, a just noticeable difference was estimated to be 1.1 dB. The corresponding just noticeable difference in Speech Transmission Index (STI) values was 0.03. This is similar to previous related estimates for speech and musical signals. To improve the acoustical characteristics of a room for speech, it is probably necessary to increase C-50 by approximately 3 dB to create a readily detectable improvement in everyday situations.
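As an illustration of the quantity discussed in the abstract, the following minimal sketch computes an early-to-late ratio with a 50 ms split and expresses a change in JND units. The function names, the synthetic exponential-decay impulse response, and the assumption that the direct sound arrives at sample 0 are all hypothetical choices made here; only the 1.1 dB JND comes from the abstract.

```python
import math

JND_C50_DB = 1.1  # just noticeable difference reported in the abstract


def c50_db(impulse_response, sample_rate_hz, split_ms=50.0):
    """Early-to-late energy ratio in dB.

    Assumes the direct sound arrives at sample 0 (an assumption of this sketch).
    """
    split = int(round(sample_rate_hz * split_ms / 1000.0))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    return 10.0 * math.log10(early / late)


def synthetic_ir(rt60_s, sample_rate_hz=8000, length_s=2.0):
    """Hypothetical test signal: an ideal exponential decay.

    The amplitude falls by 60 dB (a factor of 10**-3) after rt60_s seconds.
    """
    n = int(sample_rate_hz * length_s)
    return [10.0 ** (-3.0 * (i / sample_rate_hz) / rt60_s) for i in range(n)]


# Compare a more reverberant room with a less reverberant one.
before = c50_db(synthetic_ir(1.0), 8000)
after = c50_db(synthetic_ir(0.6), 8000)
print(f"C50 before: {before:.2f} dB, after: {after:.2f} dB")
print(f"Change: {(after - before) / JND_C50_DB:.1f} JNDs")
```

A measured impulse response would need its onset detected and trimmed before being passed to `c50_db`; that step is omitted here.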

... It is defined as the logarithmic ratio of early sound energy, arriving in the first 80 ms, to late sound energy, arriving after 80 ms [7]. Most of the research conducted in the last thirty years on C80 has focused on the determination of its just noticeable difference, JND [8][9][10][11][12]. These studies use a wide variety of musical motifs, different methods of generating the sound fields, diverse participants and multiple methods of evaluation. ...
... Bradley et al. [9] later conducted another study on JND for clarity, this time focusing on speech, C50. They also used synthetic sound fields recreated in an anechoic chamber using 8 loudspeakers. ...
... The method used to obtain the different levels of C80 is the same as in [9]. It was first applied in [9][10][11], where the aim was to determine the JND for C50. ...
Article
Full-text available
C80 is an acoustic parameter that evaluates the influence of the room on the perception of musical clarity. Numerous studies have been carried out to determine its just noticeable difference (JND). In those evaluations different musical pieces have been used, and different results have been observed depending on the pieces. In some studies, the difference was considered significant, while in others it was not. The lack of agreement on the effect that the musical piece has on the assessment of C80 is the main motivation for this research. A listening test was carried out in order to assess whether the musical motif has an influence on the perceived musical clarity of a room. In studying this influence, it was necessary to use musical motifs with very different characteristics. A total of five pieces were selected: one liturgical vocal piece, one fast tempo solo piece, one slow tempo solo piece and two orchestral pieces, one fast and one slow. These motifs were used to create the stimuli used in the test by convolution with impulse responses of rooms with different C80 levels. Impulse responses were obtained by modifying the virtual models of three rooms. Eighteen impulse responses were used with C80 levels between -5.90 and 6.75 dB. The listening test was carried out by 36 participants with varying degrees of musical training. It was intended to answer two questions: first, whether the participants were able to perceive C80 changes in all musical motifs, and second, whether the musical motif was a significant factor when assessing the C80 level of a stimulus. The first question was assessed using regression analysis and Cochran-Mantel-Haenszel (CMH) analysis. It could be observed that the participants were able to notice changes in the C80 level for the solo instrument pieces and the vocal motif. For the orchestral pieces, on the other hand, this differentiation ability could not be seen.
To assess whether the motif is a significant factor in evaluating a C80 level, a CMH analysis was performed. The results indicate that, for all rooms, musical motifs are a significant factor in assessing the clarity of a venue. Consequently, it can be stated that the musical motif used for the evaluation of the clarity of a room by means of a listening test has an influence on the results obtained.
... (1)) is a ratio of early ($I_{0}^{50}$ [W m$^{-2}$]) and late ($I_{50}^{\infty}$ [W m$^{-2}$]) sound intensity received at a point, expressed in dB. It is calculated from an impulse response, and used for the subjective assessment of intelligibility [4] with the scales proposed by Marshall [5] (Fig. 1). This parameter was introduced in 1949 by Haas [6], who observed that a sound deteriorates as soon as its reflections are perceived with a certain delay: 50 ms for speech and 80 ms for music. ...
... To enhance it, two design strategies are possible: either to increase the energy of the numerator by adding early reflections, or to reduce the energy of the denominator by dampening late reflections [9]. The Just Noticeable Difference (JND) on C50 is estimated to be 1 dB; however, a change only becomes clearly noticeable after a 3 dB shift [4]. ...
... In the first test, the size of the receiver is fixed at 0.10 m and the number of rays varies. According to Fig. 8, from 50 k rays, the standard deviation between the 10 simulations is lower than 1 dB, which is below the Just Noticeable Difference (JND) for C50 [4]. We wish to obtain the number of rays and the size of the receiver that ensures a standard deviation lower than 0.10 dB. ...
Article
We present a new graphical representation of multiple reflections for the acoustic refurbishment of rooms with low clarity values. This representation shows the spatial, temporal, and energy distribution of the incident reflections on a large number of receivers. It is based on a high-performance raytracing operation visualized on a panoramic Mollweide projection. The result is an abstract depiction that could not be measured with an acoustic camera. This representation method is used through two case studies of acoustic refurbishment: In the first one, to determine the ideal location of absorbent panels and in the second one, to improve the shape of a reflecting device in a chapel. In both cases, the images aid the designer to identify the best solution and to adapt it to the specific constraints of the project.
... The results are summarized in standards [11], which provide an objective method to predict the subjective sensitivity and in particular introduce the concept of Just Noticeable Differences (JND), i.e. the minimum objective difference in some kind of measured quantity which a human being can perceive. Unfortunately, clarity of the speech is not included in the standard database, but the literature provides proper thresholds [12]. In this view, Harvie-Clark and Dobinson used JNDs to determine the practical application in the indoor spaces [13], Bistafa and Bradley [14] used them to grade the variations of indoor acoustic field conditions, Peng et al. [15] used JND to rate the acoustic field and to correlate it to student preference. ...
... For the clarity of the speech, again ISO 3382-1 does not provide a reference. In literature, the work Bradley et al. [12] is considered as reference, which indicates a JND equal to 1 dB. ...
... Position-averaged results are reported in Figure 7. (JND) are present [11,12], while configuration C is notably different. In order to objectively demonstrate the Is the placebo effect present? ...
Article
When studying occupants' environmental perception of well-being in indoor spaces, subjective questionnaires are most often used to evaluate indoor comfort domains. When estimating parameters which can be influenced in some way by certain visual aspects, researchers may question if the obtained results could be affected by other issues. In particular, certain aesthetic aspects, such as indoor furnishing or room shape may have impact on occupants’ perceptions. For this reason, this paper focuses on determining the influence of the visual aspect on occupants’ perception of indoor acoustics. Classrooms were chosen as an example of an enclosed environment; we modified the indoor acoustics of a real classroom twice, inserting false and real sound absorbing panels featuring an identical external layout. Following this, using questionnaires, the indoor acoustics perceived by students were analyzed. Results clearly demonstrate that the sight of false absorbers is sufficient to provide the same perceived acoustics as real absorbers. Since only the visual aspect is varied, a placebo-like effect is assessed. Therefore, this issue can bias results when using surveys as a scientific approach in experimental studies.
... In a recent study by Astolfi et al., speech clarity was found to be the most suitable parameter to classify the room acoustic condition of a school classroom [28]. The JND for speech clarity parameters is expected to be of the order of 1 dB in rooms for speech [29]. -Unoccupied sound pressure levels in preschool rooms are regulated nationally in Sweden [30]. ...
... It includes effects of the room, especially the share of early-to-late acoustic energy, and the effects of signal-to-noise level. Correlations between STI with C50 and U50 (useful-to-detrimental sound ratio) have been reported [29,35]. With a stationary transmission path and knowledge of the source strength and effective background level, the STI has been found to correlate well with subjective speech intelligibility, and this is also the case for schoolchildren [36]. ...
Article
Full-text available
Preschool should promote children’s well-being and development, but the indoor sound environment is commonly problematic. The aim of our research project Supportive Preschool ACoustic Environment (SPACE) is to identify acoustic quality factors resulting in a supportive sound environment for children. This paper presents the first phase of the project where acoustic conditions were measured in unoccupied preschool rooms and analysed in terms of reverberation time, early decay time, sound strength, speech clarity, unoccupied sound pressure levels, and several room features. The results were compared with current target values, building year, and socioeconomic status of the preschool. A child perspective on room acoustics was, in addition, applied and it was revealed that children may be exposed to a lower sound strength than adults, and that adults may have better speech intelligibility conditions than children. Rooms in newer buildings had a longer reverberation time in the 125 Hz band, lower unoccupied levels, and lower sound strength. These differences could be explained by the trend towards larger rooms and porous acoustic ceilings in newer buildings. We found no significant correlations with the socioeconomic status. Ongoing work will facilitate an analysis of the correlation between the room acoustic parameters, the sound environment and children’s perception.
... Similar observations were found for the better ear evaluation method. However, it must be mentioned that all differences were lower than 0.03, which is considered as the just-noticeable difference for the STI [13]. Figure 6. ...
... This effect was reduced in this study by common facing of the HATSs towards the omni-directional source, which explains the small differences between the left and right ear, resulting in small differences between the bSTI and the reference STI. Differences found in this work were below the just-noticeable difference [13] of STI, which is based on an investigation with adult participants. It can, therefore, be assumed that the differences are negligible. ...
Conference Paper
Full-text available
Acoustic measurements conducted using head and torso simulators (HATS) are considered to represent natural human hearing more realistically in comparison to measurements conducted using omnidirectional microphones. Traditionally, HATS are designed and built with respect to the anthropometric data of adults. Correspondingly, evaluation methods and metrics were primarily developed based on adults. Nevertheless, children are a major group of interest in learning spaces, and usually, they have different anthropometric head and torso dimensions than adults. This fact leads to the question of whether existing acoustic assessment methods are also valid for children. This work explores the differences in the speech transmission index (STI) derived from measurements using HATS with different anthropometric sizes with respect to children and adults.
... The analysis of simulation results, compared with measurement findings also in terms of just noticeable difference (JND) units in STI values [40], illustrates in which settings and environments the tool exhibits greater robustness for its use in preliminary or design assessments of speech intelligibility in classrooms. ...
... The study associated with the just noticeable difference in STI, that is, the variation in STI values for which 50% of subjects can perceive the difference, determined a STI JND equal to 0.03 in simulated sound fields, but a STI JND of 0.1 is considered more realistic in everyday listening situations [40]. For this reason, we have calculated the number of JND units (STI JNDs) between STI values from measurements and simulations with both thresholds. ...
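The calculation described in this excerpt can be sketched as follows. This is only an illustration under stated assumptions: the function name and the measured/simulated STI pairs are hypothetical, and only the two thresholds (0.03 and 0.1) come from the text.

```python
STI_JND_SIMULATED_FIELDS = 0.03  # threshold for simulated sound fields (from the text)
STI_JND_EVERYDAY = 0.1           # threshold for everyday listening (from the text)


def jnd_units(sti_measured, sti_simulated, jnd):
    """Absolute STI difference expressed as a multiple of the chosen JND."""
    return abs(sti_measured - sti_simulated) / jnd


# Hypothetical (measured, simulated) STI pairs, for illustration only.
pairs = [(0.62, 0.65), (0.55, 0.48), (0.70, 0.71)]

for measured, simulated in pairs:
    strict = jnd_units(measured, simulated, STI_JND_SIMULATED_FIELDS)
    relaxed = jnd_units(measured, simulated, STI_JND_EVERYDAY)
    print(f"measured={measured:.2f} simulated={simulated:.2f} "
          f"-> {strict:.2f} JNDs (0.03), {relaxed:.2f} JNDs (0.1)")
```

A value below 1 means the difference between measurement and simulation is smaller than one JND under that threshold.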
Article
According to Italian regulation, the Ministerial Decree of 11 October 2017 about Environmental Criteria, reference values for acoustic indoor quality descriptors in public buildings are imposed. Regarding school environments, indoor acoustic quality targets refer to reverberation time, clarity, and speech intelligibility, whose representative acoustic descriptor is the speech transmission index (STI). This paper presents pyeSTImate, a Python-based tool for speech transmission index prediction in lecture rooms. The tool returns fully simulated results from the dimensions and material characteristics of classrooms with parallelepiped geometry and without limitations in size. Extensive experiments have been conducted with different simulation methods, evaluating the accuracy by comparison with in situ measurements selected from primary, secondary, and university classrooms in school buildings of the Marche Region in Italy. The combination of simulated speech transmission indexes with a prediction method based on an artificial neural network has also been evaluated. The analysis of the performance demonstrates the computational robustness of the tool that enables its use for the analysis of existing rooms, as well as for the renovation and design of new spaces.
... In terms of MRE, the error for all the parameters is always below 10%, with the worst case given by C80 (9.29%), and close to 2% for STI and SII. Note that the accuracy achieved may be enough for common acoustic monitoring applications, as the standard deviation of errors tends to be below the just noticeable difference (JND) [47], [48]. Figures 3, 4, 5, 6 and 7 show scatter plots for the values predicted by the model against the actual ones, providing a more descriptive view of the prediction error. ...
... Nonetheless, the overall system performance returns the average MAE in prediction across the 6 nodes shown in last [47], [48] although with less distance than in the test partition evaluation. The prediction errors for C50, C80 and SII are a little higher and may be due to the unwanted effects discussed above; however, the errors for RT60 and STI are slightly lower than in the test partition evaluation. ...
Article
Room acoustical parameters have been widely used to describe sound perception in indoor environments, such as concert halls, conference rooms, etc. Many of them have been standardized and often have a high computational demand. With the increasing presence of deep learning approaches in automatic monitoring systems, wireless acoustic sensor networks (WASNs) offer great potential to facilitate the estimation of such parameters. In this scenario, Convolutional Neural Networks (CNNs) offer significant reductions in the computational requirements for in-node parameter predictions, enabling the so-called Artificial Intelligence-Internet of Things (AI-IoT). In this paper, we describe the design and analysis of a CNN trained to predict simultaneously a set of common room acoustical parameters directly from speech signals, without the need for specific impulse response measurements. The results show that the proposed CNN-based prediction of room acoustical parameters and speech intelligibility achieves a relative error rate of less than 5.5%, accompanied by a computational speedup factor close to 250 with respect to the conventional signal processing approach.
... This enables the SNR to be linearly converted to the BSTI, e.g., a SNR increase of 6 dB results in a BSTI increase of 0.2. Thus, the just-noticeable-difference (JND) of 0.03 [33] for BSTI approximately corresponds to a JND of 0.9 dB for SNR, as indicated by the auxiliary line in Fig. 2. ...
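The arithmetic behind this conversion can be checked with a short sketch. It assumes the relationship is strictly linear over the range of interest, as the excerpt states; the function and constant names are hypothetical.

```python
# Slope taken from the excerpt: a 6 dB SNR increase gives a 0.2 BSTI increase.
BSTI_PER_DB = 0.2 / 6.0


def snr_change_to_bsti(delta_snr_db):
    """BSTI change for a given SNR change, assuming the linear relation holds."""
    return delta_snr_db * BSTI_PER_DB


def bsti_change_to_snr_db(delta_bsti):
    """SNR change (dB) equivalent to a given BSTI change, same assumption."""
    return delta_bsti / BSTI_PER_DB


BSTI_JND = 0.03  # JND quoted in the excerpt
print(f"A BSTI JND of {BSTI_JND} corresponds to "
      f"{bsti_change_to_snr_db(BSTI_JND):.1f} dB of SNR")
```

Outside the linear region of the SNR-to-BSTI mapping this conversion would not apply.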
... This is similar to the results in Figs. 1 and 2. The average SDs of individual BSTIs are 0.03, 0.025, 0.025 and 0.023 for speaker distances of 0.2 m, 0.3 m, 0.5 m and 1.0 m, respectively. Except for a distance of 0.2 m, the average SD is less than the JND of the BSTI [33]. It seems that the individual differences in the BSTIs from individual subjects' HRTFs can almost be ignored. ...
Article
As well as the background noise and acoustic conditions of a given enclosed space, the binaural effect has a significant influence on the binaural speech transmission index (BSTI), especially for nearby sources. This effect can be quantitatively described by head-related transfer functions (HRTFs). As HRTFs are highly individual, the BSTIs from one person's HRTF should differ from those of other people. However, this issue has rarely been studied. In the current work, we used the near-field HRTFs of 56 people to obtain the corresponding BSTIs, and analyzed the individual differences among them. The results show that the average standard deviation among the 56 individual BSTIs is almost within the just-noticeable-difference level of 0.03, and decreases with increasing sound source distance. We then acquired BSTIs from the KEMAR manikin and found that there were no significant differences with the median BSTIs across the 56 subjects. Thus, using the BSTIs of the KEMAR manikin, we developed regression BSTI models as a function of the azimuth based on seventh-order polynomial regression at different source distances.
... In the latter, listening tests are mainly used for three purposes. First, to describe the just noticeable differences (JNDs) of room acoustical parameters [11,15-22], i.e., to determine the minimum variation that must occur in a parameter for that change to be noticeable. Second, to evaluate whether the difference between two or more stimuli is perceptible or not [13,14,23-29] (e.g., the difference between real recordings and auralizations, between auralizations with different sound source directivity patterns, etc.). ...
... Both s and b CDS are supported by A/NotA-R (and consequently by CR-SD) as described in [59]. These protocols have been selected because the method of minimal changes [48] and variations of it, frequently used for the determination of JNDs in the area of room acoustics [11,15,17,21], are procedurally based on a SD protocol with constant reference. However, the particular variation of the protocol used by researchers varies among studies: some of the researchers inform the participant of the existence of a constant reference, while others omit this information. ...
Article
Listening tests are key to evaluate the perception of difference between confusable auditory stimuli, among other purposes. In room acoustics, the usefulness of listening tests is beyond question. They have been extensively employed and have allowed relevant and interesting conclusions to be drawn for different purposes, such as the determination of the just noticeable differences (JNDs) of room acoustical parameters and the evaluation of subtle differences between auralizations, among many others. However, the lack of methodological consensus has led research with similar goals to yield uneven results. Among the possible causes is the fact that different testing protocols are employed to present the stimuli and to question the participants, which can have a significant influence on the discrimination abilities if the sensitivity of the various protocols is not comparable. In room acoustics, nevertheless, few studies have been conducted so far to assess the impact that these protocols may have on discrimination. Consequently, it might be of great interest to carry out experiments aiming to compare several protocols with a population of participants as wide as possible. In this context, the purpose of this research is to evaluate the operational power of several protocols that can be employed to compare confusable auditory stimuli. To this end, a listening test has been carried out on a large population of 134 participants in which the discrimination performance of seven protocols has been assessed. Some of these protocols have been selected among the most widely used in room acoustics, while others have been chosen as they have been found to be operationally powerful in other fields of sensory discrimination. 
The results, analyzed by means of the Thurstonian measure of sensory difference d-prime (d′), made it possible to assess the operational power of each protocol, as well as to evaluate the influence that experimental effects such as those of sequence, learning and fatigue have on their discrimination capabilities. This evaluation has revealed that the protocol has a significant influence on the discrimination ability, making it important to pay close attention to this fact when designing listening tests in room acoustics. Furthermore, it made it possible to identify the most recommended protocol among those tested in the study regarding both its remarkable operational power and its low propensity to experimental effects.
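For readers unfamiliar with d′, the following is a minimal sketch of how it is commonly computed for a simple yes/no (A/Not-A style) task under the equal-variance Gaussian model. This is an assumption made here for illustration: the abstract does not state which decision model was applied to each of the seven protocols, and other protocols require different formulas. The function name, the example counts, and the log-linear correction are also choices of this sketch.

```python
from statistics import NormalDist


def d_prime_yes_no(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate) for a yes/no task.

    A log-linear correction (add 0.5 to each count, 1 to each total) keeps
    the rates strictly between 0 and 1 so the inverse CDF stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical response counts, for illustration only.
print(f"chance performance: d' = {d_prime_yes_no(25, 25, 25, 25):.2f}")
print(f"good discrimination: d' = {d_prime_yes_no(40, 10, 10, 40):.2f}")
```

A d′ of zero means the two stimuli were not discriminated at all; larger values indicate better discrimination.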
... The vertical walls of the room were then oriented on the plane view, in such a way as to provide some lateral reflections, thus enhancing the enveloping sensation [2]. The acoustician ensured that the delay between these first reflections and the direct sound never exceeded 50 milliseconds in theaters or lecture rooms, or 80 milliseconds for music [3]. Early raytracing software was very slow, and could only be used to study theoretical problems, for example, to show that absorbing materials were more effective in reducing reverberation if they were dispersed throughout the room rather than concentrated in one place [4]. ...
Conference Paper
Twenty years ago, the Radit2D program offered acousticians an interactive tool to design the shape of a room cross-section with the first reflection. Based on the method of images, it was enough for the sequence of straight-line segments drawing the enclosure to follow a regular curve for its orthotomy to appear by connecting the points of the image. This new curve quickly appeared as a guide, especially when it was necessary to direct the sound energy toward a particular area of the room, without focusing or dispersing it. We then studied the logarithmic spiral, the best candidate to assume this role, typically for the acoustic ceiling of a large conference room. Recent advances in raytracing techniques, with extremely fast shots computed directly in the CPU, now allow us to provide a new representation of the set of specular reflections in three dimensions, not only for a particular receiver chosen from the public, but also, and simultaneously, for the entire concerned public. This makes it possible, for example, to find the ideal shape of a concert hall, as complex as it may be, which guarantees for the whole audience the best possible compromise between reverberation and Lateral Energy Fraction.
... Five additional sounds (a dunnock, a baboon, a river, an impala and a forest atmosphere) were used in the first five trials of each block, but the data obtained from these trials were not included in the analysis. These five trials were used to allow participants to get immersed in the two different acoustic conditions (art+ and art−), and to ensure that the task to be carried out was clearly understood (Bradley et al., 1999;Witew et al., 2005;Vigeant and Celmer, 2010;Marquis-Favre et al., 2019). These additional sounds were extracted from the BBC sound effects repository; details are provided in Supplementary Table S2. ...
Article
Full-text available
The major goal of psychoarchaeoacoustics is to understand the psychology behind motivations and emotions of past communities when selecting certain acoustic environments to set activities involving the production of paintings and carvings. Within this framework, the present study seeks to explore whether a group of archaeological rock art sites in Altai (Siberia, Russia) are distinguished by particular acoustic imprints that elicit distinct reactions on listeners, in perceptual and emotional terms. Sixty participants were presented with a series of natural sounds convolved with six impulse responses from Altai, three of them recorded in locations in front of rock art panels and three of them in front of similar locations but without any trace of rock art. Participants were asked about their subjective perception of the sounds presented, using 10 psychoacoustic and emotional scales. The mixed ANOVA analyses carried out revealed that feelings of "presence," "closeness," and "tension" evoked by all sounds were significantly influenced by the location. These effects were attributed to the differences in reverberation between the locations with and without rock art. Although the results are not consistent across all the studied rock art sites, and acknowledging the presence of several limitations, this study highlights the significance of its methodology. It stresses the crucial aspect of incorporating the limitations encountered in shaping future research endeavors.
... The red dotted lines indicate a range of ±0.03 STI, which is the uncertainty associated with the STIPA method [4]. This value is also widely accepted as approximately the Just Noticeable Difference (JND) for the STI and STIPA [3,20]. ...
Article
Full-text available
Objective speech intelligibility estimations undertaken in natural acoustics speech communications (NAS) scenarios require the utilization of a speech source that approximates the acoustic characteristics of a human talker. Only a limited number of special speech sources that conform to the specifications in the relevant guidelines are available in the market; however, they can be deemed expensive by professional practitioners and other users. Non-special and affordable loudspeakers are often used in NAS investigations in place of standardized special speech sources without the knowledge of their suitability and results validity. This study aims to examine the suitability of a range of representative common and affordable non-special loudspeakers as a potential alternative to standardized speech sources in NAS indicative or pilot investigations. Frequency response and Speech Transmission Index Public Address (STIPA) experimental results obtained from a reference standardized speech source were compared against results from various non-special loudspeakers measured utilizing diverse and real-world representative combinations of NAS acoustic conditions under controlled laboratory conditions. STIPA mean absolute errors for the alternative speech sources were generally lower than the STIPA method uncertainty and one Just Noticeable Difference (0.03 STI). The findings of this study will inform practitioners of the suitability of affordable loudspeakers when standardized special test loudspeakers are not available.
... The just noticeable difference (JND), also referred to as the difference threshold, is the minimum change in stimulus intensity required to produce a noticeable difference in sensory experience [1]. The JND has been widely applied in the multimedia domain, such as audio perceptual assessment [2]- [4] and watermarking [5]. It has also been used to determine the optimal compression level for images [6], [7] and videos [8]. ...
Preprint
Full-text available
The just noticeable difference (JND) is the minimal difference between stimuli that can be detected by a person. The picture-wise just noticeable difference (PJND) for a given reference image and a compression algorithm represents the minimal level of compression that causes noticeable differences in the reconstruction. These differences can only be observed in some specific regions within the image, dubbed as JND-critical regions. Identifying these regions can improve the development of image compression algorithms. Due to the fact that visual perception varies among individuals, determining the PJND values and JND-critical regions for a target population of consumers requires subjective assessment experiments involving a sufficiently large number of observers. In this paper, we propose a novel framework for conducting such experiments using crowdsourcing. By applying this framework, we created a novel PJND dataset, KonJND++, consisting of 300 source images, compressed versions thereof under JPEG or BPG compression, and an average of 43 ratings of PJND and 129 self-reported locations of JND-critical regions for each source image. Our experiments demonstrate the effectiveness and reliability of our proposed framework, which is easy to be adapted for collecting a large-scale dataset. The source code and dataset are available at https://github.com/angchen-dev/LocJND.
... When the correlation between SNR and STI values for the instruction situation was analysed, the results indicated a weak to moderate correlation between SNR and STI (R² = 0.34). The increment of around 7 dBA in SNR from normal to raised vocal effort would provide only a minimal improvement of around 0.02 in STI in Classroom-P, which does not even amount to one just noticeable difference (JND) [73]. This result indicates that, once an SNR of 12 dBA is reached, a further increase has little effect on STI in the acoustically treated classroom. ...
... In addition, an SPL(A) comparison at the same receivers shows that SPL(A) values in Cond. 1 are more than 1 dB larger than those in Cond. 2. The maximum difference is 2.3 dB at R6. A comparison of RASTI between the two conditions shows that RASTI values in Cond. 2 exceed those in Cond. 1 by more than the JND of the speech transmission index, 0.03 [39], at all receivers. ...
Article
Full-text available
This paper presents a proposal of an efficient binaural room-acoustics auralization method, an essential goal of room-acoustics modeling. The method uses a massively parallel wave-based room-acoustics solver based on a dispersion-optimized explicit time-domain finite element method (TD-FEM). The binaural room-acoustics auralization uses a hybrid technique of first-order Ambisonics (FOA) and head-related transfer functions. Ambisonics encoding uses room impulse responses computed by a parallel wave-based room-acoustics solver that can model sound absorbers with complex-valued surface impedance. Details are given of the novel procedure for computing expansion coefficients of spherical harmonics composing the FOA signal. This report is the first presenting a parallel wave-based solver able to simulate room impulse responses with practical computational times using an HPC cloud environment. A meeting room problem and a classroom problem are used, respectively, having 35 million degrees of freedom (DOF) and 100 million DOF, to test the parallel performance of up to 6144 CPU cores. Then, the potential of the proposed binaural room-acoustics auralization method is demonstrated via an auditorium acoustics simulation of up to 5 kHz having 750,000,000 DOFs. Room-acoustics auralization is performed with two acoustics treatment scenarios and room-acoustics evaluations that use an FOA signal, binaural room impulse response, and four room acoustical parameters. The auditorium acoustics simulation showed that the proposed method enables binaural room-acoustics auralization within 13,000 s using 6144 cores.
... The method, which compares the energy of the direct sound and early reflections with that of the late reflections and reverberation to predict speech articulation, has been used for many years. The most common measurement index is C50, the ratio of the sound energy arriving in the first 50 ms to the sound energy arriving later (Bradley, Reich, & Norcross, 1999). ...
Article
Full-text available
Acoustic analysis plays an irreplaceable role in obtaining information on the tradition of Chinese Buddhist main halls and the related practices that involve sound, deepening our understanding of Chinese Buddhist architectural heritage. Various ceremonies and dojos give rise to a rich variety of sound fields in Buddhist main halls. In this paper, the indoor sound fields of the main halls of four Buddhist temples are studied and compared. The models were built in SketchUp, and the sound fields during ceremonies and dojos were simulated with COMSOL Multiphysics. The four halls are the Chongshan temple main hall, the Xiantong temple main hall, the Shuxiang temple Manjusri hall and the Bodhisattva Top main hall, located in four separate temples on Wutai mountain in Shanxi, China. Targeting three acoustic parameters, namely the reverberation time (RT), the first ray arrival time (Re1first) and the surface sound pressure level (SPL) distribution, we simulated the acoustic effect of the space occupancy, the Buddha realm space and the worship space. The results indicate that the acoustic wave diffusion rate was positively correlated with the ratio of hall height to depth, while the first arrival time showed the opposite trend. The largest RT at 2000 Hz (about 1.3 s) in the shortest period for 500 Hz voice was observed in the main hall of Pusa Peak, while T60 even reached 4 s in the Xiantong temple main hall. The acoustic wave transmission rate was positively correlated with the ratio of hall height to depth, but the first ray arrival time showed the opposite trend. The main hall of Pusa Peak had the shortest first ray arrival time (0.0150 s), and the Shuxiang temple main hall had the longest (0.0381 s). In all cases, the appearance of a "sound shadow area" in the surface SPL distribution and the uneven sound energy distribution showed that the pillars in the middle space exert a significant impact on the acoustics of the Daxiong main hall.
... The second parameter is speech clarity C50 (in dB), defined as the logarithm of the ratio of the sound energy in the early reflections (< 50 ms) to the sound energy in the reflections that arrive later (> 50 ms) (Bradley et al., 1999). The underlying assumption is that the early reflections (< 50 ms) are the part of the sound that the listener can perceive clearly, whereas reflections arriving after 50 ms blur the clarity of the sound and obscure the listener's understanding of sentences. ...
Article
Full-text available
This paper examines the quality of the acoustic parameters of five rooms at Universitas XYZ, grouped into three functional clusters (studio, meeting room, and classroom). Recordings of an impulsive sound source were analyzed with the REW software to obtain the reverberation time RT60 and the speech clarity C50, which can be used to rank the five rooms. The analysis shows a close correlation between the acoustic quality of a room and its volume, its construction materials, and its contents.
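The early-to-late definition of C50 cited in this entry can be illustrated with a minimal sketch. It assumes a sampled room impulse response whose first sample coincides with the arrival of the direct sound; the function names `c50_db` and `synthetic_decay` are hypothetical, and the exponential decay is an idealized test signal, not data from any of the studies listed here.

```python
import math


def c50_db(impulse_response, sample_rate_hz, split_ms=50.0):
    """Early-to-late energy ratio in dB for a sampled impulse response.

    Assumption: index 0 is the arrival of the direct sound.
    """
    split = int(round(sample_rate_hz * split_ms / 1000.0))
    # Energy is the sum of squared pressure samples on each side of the split.
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    if early <= 0.0 or late <= 0.0:
        raise ValueError("need non-zero energy on both sides of the split")
    return 10.0 * math.log10(early / late)


def synthetic_decay(rt60_s, sample_rate_hz, duration_s):
    """Idealized test signal: amplitude falls by 60 dB after rt60_s seconds."""
    n = int(sample_rate_hz * duration_s)
    return [10.0 ** (-3.0 * i / (sample_rate_hz * rt60_s)) for i in range(n)]


if __name__ == "__main__":
    for rt60 in (0.3, 0.5, 1.0):
        ir = synthetic_decay(rt60, 8000, 2.0)
        print(f"RT60 = {rt60:.1f} s -> C50 = {c50_db(ir, 8000):.1f} dB")
```

For a purely exponential decay the ratio depends only on the reverberation time, so shorter reverberation gives a higher C50; measured impulse responses will deviate from this idealization.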
... Based on the original impulse responses, with relative magnitude aligned to 0 dB at 1000 Hz, and on those equalized with different equalization algorithms, the corresponding STIs were calculated using the impulse-based indirect method, as shown in Figure 3. Note that, except for the 8010 source, the STIs after equalization differ markedly from those before equalization, with differences of more than 0.3 in some cases, far exceeding the just-noticeable difference (JND) of 0.03 [16]. On the one hand, the impulse response used for equalization introduces considerable additional energy into the STI measurement, which leads to a large STI difference before and after equalization. ...
Conference Paper
Full-text available
The frequency response of the sound source may be one of the main causes of inaccuracy in speech transmission index (STI) measurement. Equalizing the frequency response of the sound source is therefore important for meeting STI measurement requirements. However, the performance of different equalization methods in STI measurement has rarely been discussed in previous research. This study therefore investigates the effect of equalization algorithms, including the Kirkeby algorithm and the minimum-phase reconstruction algorithm, and the effect of magnitude normalization methods of the frequency response used for equalization, on STI measurement. First, the impulse responses were measured in an anechoic room with three types of directional loudspeakers, and then used to calculate the corresponding STIs with the impulse-based indirect method. Results show that, compared with the Kirkeby algorithm, the equalization algorithm using minimum-phase reconstruction can obtain a flatter frequency response. The STI difference caused by equalization depends not only on the frequency response of the sound sources, but also on the magnitude normalization method of the frequency response used for equalization. Using an energy-normalized frequency response for equalization is recommended, so as to avoid introducing additional energy.
... The direct-to-reverberant energy ratio (DRR) yields only 4 discriminable levels, based on the measured JNDs by Larsen et al. (2008). In case of center time (Cox et al. 1993) as well as for clarity C 50 (J. S. Bradley et al. 1999), a total of 11 JNDs fit in the observed range. ...
Thesis
Full-text available
We live in a physical environment in which interactions with physical objects evoke sound. This auditory feedback conveys information on the involved objects and on the specific type of interaction. We constantly adapt to the auditory feedback, even unconsciously, while pursuing our everyday activities. The digital environment which is becoming increasingly important lacks this immediate connection. It thus becomes necessary to project the digital information into the physical world in a plausible and usable way. With the visual domain being already overloaded, we propose the use of auditory augmentation to provide a calm communication channel by adding augmented auditory feedback to physical objects or interactions. Auditory augmentation is investigated within this thesis in 6 different ways. (1) We present experimental platforms for invisible auditory augmentation of everyday objects as well as for exploring the limits of plausibility. (2) As even naive listeners are already skilled in the interpretation of sounds of physical origin, we propose a physical sound model to synthesize augmented auditory feedback that integrates seamlessly in the everyday acoustic environment. (3) We review how physical information is encoded in auditory feedback, and investigate what portion of it is actually perceived and interpreted by human listeners, with a focus on rectangular plates. (4) We present an algorithm that successfully identifies material, size, and shape from sound. (5) We introduce an algorithm for auditory contrast enhancement which makes certain sound characteristics more salient. (6) We explore what kinds of data, physical objects, and interactions are suitable for auditory augmentation, and how much information it allows to monitor in the periphery of attention. Conclusions are drawn based on several case studies of auditory augmentations. 
This thesis provides the theoretical foundations as well as practical solutions and guidelines for designers of future auditory augmentations.
... However, the lowest value in this range is 0.27 while the threshold for "poor" is 0.3. This difference is equal to one JND for STI (Bradley et al., 1999). DL2, or the spatial decay of sound pressure per distance doubling, was derived for a path of receivers in the line of sight of the sound source in the simulation. ...
... It is worth noting that the maximum differences in the STI 10 and STI 90 among the three curves are up to 0.27 and 0.06, respectively (see Fig. 4). Bradley et al. [52] found that a just noticeable difference (JND) of STI is 0.03, which means that the differences in the STI 10 and STI 90 are large. One possible explanation could be the difference in language environments. ...
Article
Speech noise can reduce occupants' work performance in open-plan offices. Some models have been created to predict the effect of speech of different intelligibility on work performance. However, few of them consider the effects of speech intelligibility in Chinese environments. This study aims to develop a model that evaluates how much work performance is decreased by speech noise with different intelligibility in Chinese open-plan offices. A laboratory experiment has been conducted in this paper to determine the effects of different speech intelligibility on occupants' objective performance of the serial recall task and perceived speech disturbance in Chinese open-plan offices. Then, a prediction model was developed by analyzing the data from this experiment and two previous studies. These two studies researched the effects of the Speech Transmission Index (STI, an objective parameter of speech intelligibility) on the serial recall performance in Chinese environments. According to the prediction model of serial recall performance in Chinese environments, performance decrease occurs within the STI range of 0.31–0.47. The comparison of curves between STI and DP with previous studies shows that the STI range for serial recall performance variation in Chinese environments is narrower than in non-Chinese language environments. Furthermore, the DP average change rate of serial recall tasks in Chinese environments is not less compared to non-Chinese environments, although the effect of speech noise on serial recall performance is lower in the Chinese environment.
... When considering an early-to-late ratio for a limit of 50 ms, consistent with the definition of speech clarity C50, the difference between measurements and generated impulse responses is around 1 dB. For speech signals, the just-noticeable difference of C50 is around 1.1 dB, and a significant change value is considered to be around 3 dB [24]. The early energy difference observed between the measured impulse responses and those generated by the envelope-based model is less than 3 dB. ...
Thesis
Full-text available
This thesis aims to investigate a variety of effects linking the auditory distance perception of virtual sound sources to the context of audio-only augmented reality (AAR) applications. It focuses on how its specific perceptual context and primary objectives impose constraints on the design of the distance rendering approach used to generate virtual sound sources for AAR applications. AAR refers to a set of technologies that aim to merge computer-generated auditory content into a user's acoustic environment. AAR systems have fundamental requirements as an audio playback system must enable a seamless integration of virtual sound events within the user's environment. Different challenges arise from these critical requirements. The first part of the thesis concerns the critical role of acoustic cue reproduction in the auditory distance perception of virtual sound sources in the context of audio-only augmented reality. Auditory distance perception is based on a range of cues categorized as acoustic, and cognitive. We examined which strategies for weighting auditory cues are used by the auditory system to create the perception of sound distance. By considering different spatial and temporal segmentations, we attempted to characterize how early energy is perceived in relation to reverberation. The second part of the thesis's motivations focuses on how, in AAR applications, environment-related cues could impact the perception of virtual sound sources. In AAR applications, the geometry of the environment is not always completely considered. In particular, the calibration effect induced by the perception of the visual environment on the auditory perception is generally overlooked. We also became interested in the instance in which co-occurring real sound sources whose placements are unknown to the user could affect the auditory distance perception of virtual sound sources through an intra-modal calibration effect.
... However, there are experimental studies that have analyzed the minimum differences of this parameter in order to be perceived by the human ear. Bradley et al. estimated a JND of 1.1 dB, with an increase of approximately 3 dB being necessary to create an easily detectable improvement in everyday situations [40]. With all this, it is important to note that today there is no definitive consensus accepted by the entire scientific community [41]. ...
Article
Full-text available
The acoustic evaluation of indoor environments is a common application of virtual acoustics. It is also a useful tool in the study of cultural heritage buildings, but it is less commonly used to describe the acoustic environment of intangible cultural heritage events, or of outdoor environments. In this paper, the acoustic environment of the Water Tribunal of the Plain of Valencia (Spain) is studied. It is analyzed from a soundscape perspective, characterizing the sound source, evaluating it in relation to the environment, and evaluating the subjective response to it. The research yields, on the one hand, a complete study of the acoustics of the environment of the Water Tribunal and, on the other, an enhancement of Valencian tangible and intangible heritage.
... The listeners' intelligibility scores diminishing in the seminar room (5 m) to almost flooring at chapel (5 m) (3.3 dB) showed that neither clarity nor reverberation time was adequate to predict speech intelligibility by themselves. While C50 has been used as an objective measure of speech intelligibility in general (Bradley et al., 1999; Pulkki and Karjalainen, 2015), a reduction of C50 of 7.9 dB between the seminar room (2 m) and chapel (2 m) in the current study did not affect how the listeners understood the speech apart from at angles -135° and -90°. The results only corroborated our hypothesis of more adverse room acoustics causing an increased detrimental effect to the listeners' benefit from SRM at the two 5 m conditions, seminar room (5 m) and chapel (5 m). ...
Article
Full-text available
It is well known that we benefit from binaural hearing when listening to the speech of interest amongst noise, where spatial cues may release our auditory perception from masking. However, this benefit deteriorates with external factors such as the reverberation in the room, as well as internal factors such as our familiarity with the language of interest. The current study examined spatial release from masking (SRM) experienced by listeners with different ages of immersion in New Zealand English (NZE) in varying room acoustics. A speech intelligibility test was conducted using an Ambisonic-based sound reproduction system to reproduce speech and noise as if they were produced in a seminar room and a chapel at two distances between the source and the listener: 2 m and 5 m. The rooms differed in reverberation time (RT), and the distances modified the speech clarity (C50). The participants were split into an early immersed group (n=20) and a late immersed group (n=37), where the participants in the early immersed group were immersed in NZE before the age of 13, and those in the late immersed group were immersed after the age of 15. A babble noise was played from eight azimuthal angles (0°, ±45°, ±90°, ±135°, 180°) while the target speech, which consisted of sentences from the Bamford-Kowal-Bench (BKB) corpus, was played from 0°. Early immersed listeners were able to understand speech better than late immersed listeners within most of the room acoustics tested. However, once the room acoustics caused overly adverse listening conditions at a high RT and low C50, neither group could benefit from SRM. The early immersed group was also able to make use of spatial cues to benefit from SRM more than the late immersed group, even in the least reverberant room scenario in the current study. Finally, while room acoustics affected how the groups benefitted from SRM, this effect was only observed when the source was located 5 m from the listener.
... Of course, it also differs among people. In experiences of acoustic perception in rooms, the JND of some parameters is usually within 1-2 dB [33,34]. Even the accuracy of a type I sound level meter is between ± 2 dB from low (25 Hz) to high (10 kHz) frequencies [35]. ...
Article
Full-text available
Whereas noise generated by road traffic is an important factor in urban pollution, little attention has been paid to this issue in the field of hydrogen-fueled vehicles. The objective of this study is to analyze the influence of the type of fuel (gasoline or hydrogen) on the sound levels produced by a vehicle with an internal combustion engine. A Volkswagen Polo 1.4 vehicle adapted for bi-fuel hydrogen-gasoline operation has been used. Tests were carried out with the vehicle stationary to eliminate rolling and aerodynamic noise. Acoustic and psychoacoustic levels were measured both inside and outside the vehicle. A slight increase in the noise level has been found only outside the vehicle when using hydrogen as fuel, compared to gasoline. The increase is statistically significant, can be quantified as between 1.1 and 1.7 dBA, and is mainly due to an intensification of the 500 Hz band. Loudness is also higher outside the vehicle (between 2 and 4 sones) when the fuel is hydrogen. Differences in sharpness and roughness values are lower than the just-noticeable difference (JND) values of the parameters. Higher noise levels produced by hydrogen can be attributed to its higher reactivity compared to gasoline.
... Furthermore, for characterizing noise and room acoustic measurements in such institutions, two possibilities include using omnidirectional and binaural transducers. The former allow a range of measurements including standardized ones (Bradley et al., 1999; American National Standards Institute, 2002; Building Bulletin 93, 2015; Deutsches Institut für Normung, 2016; Astolfi et al., 2019a), while the latter are generally incorporated as microphones near the ears of humans (who may or may not have freedom of movement) or within the ear canals of a head and torso simulator (HATS). Binaural transducers allow measurements that can represent some of the effects of head, shoulders, and outer ear processing for static listeners (i.e., without head movements). ...
Article
Full-text available
Children spend a considerable amount of time in educational institutions, where they are constantly exposed to noisy sound environments, which has detrimental effects on children's health and cognitive development. Extensive room acoustics measurements and long-term in-situ measurements in such institutions are scarce and are generally conducted using omnidirectional microphones. This study provides preliminary results of room acoustics in unoccupied conditions and in-situ noise measurements during occupancy, in classrooms and playrooms in Germany using an omnidirectional microphone, an adult HATS (head and torso simulator), and a child HATS. The results indicate that room acoustics of most of the sampled rooms need improvement (mid-frequency reverberation time, T 30 (s) = 0.6 (0.3-1.1) and clarity index, C 50 (dB) = 6.1 (1.6-10.4); speech transmission index (STI) = 0.7 (0.6-0.8); mean values and range); the sound pressure level (SPL) during activities was around 66 dB (A-weighted equivalent level SPL) in both classrooms and playrooms using omnidirectional measurements, which is somewhat lower than similar measurements in other countries that varied in measurement periods; psychoacoustics parameters relating to sound fluctuation (fluctuation strength and roughness) show variation with increasing room volumes; and that there may be some benefit in considering child HATS for in-situ noise measurements. While the validity of these results in relation to children's perceptual evaluation (using questionnaires, etc.) is subject to future investigations, the results highlight some of the nuances in the choice of transducers in measurements with children and potential benefits of psychoacoustic parameters in complementing the SPL-based parameters in more comprehensively characterizing the noise environments in educational institutions.
... Tables 2 and 3 respectively present comparisons of e_EDT and e_C among the 4th-E TD-FEM, the Opt-E TD-FEMs with four optimized settings, 4th-I TD-FEM, and the 2nd-I TD-FEM. Herein, the results below 1 kHz were omitted because the error magnitude was smaller than the JND values for EDT [70] and C 50 [71], i.e., 5% for EDT and 1.1 dB for C 50 . Only the two Opt-E TD-FEMs optimized at R = 6.25 and R = 5.5 produce smaller errors than JND values for both room acoustic parameters at all frequencies. ...
Article
Full-text available
Wave-based acoustics simulation methods such as finite element method (FEM) are reliable computer simulation tools for predicting acoustics in architectural spaces. Nevertheless, their application to practical room acoustics design is difficult because of their high computational costs. Therefore, we propose herein a parallel wave-based acoustics simulation method using dissipation-free and dispersion-optimized explicit time-domain FEM (TD-FEM) for simulating room acoustics at large-scale scenes. It can model sound absorbers with locally reacting frequency-dependent impedance boundary conditions (BCs). The method can use domain decomposition method (DDM)-based parallel computing to compute acoustics in large rooms at kilohertz frequencies. After validation studies of the proposed method via impedance tube and small cubic room problems including frequency-dependent impedance BCs of two porous type sound absorbers and a Helmholtz type sound absorber, the efficiency of the method against two implicit TD-FEMs was assessed. Faster computations and equivalent accuracy were achieved. Finally, acoustics simulation of an auditorium of 2271 m³ presenting a problem size of about 150,000,000 degrees of freedom demonstrated the practicality of the DDM-based parallel solver. Using 512 CPU cores on a parallel computer system, the proposed parallel solver can compute impulse responses with 3 s time length, including frequency components up to 3 kHz within 9000 s.
... The smallest STI is 0.64, measured at the R27 receiver position in CL5, and the largest STI is 0.88, measured at the R7 receiver position in CL2 and the R13 receiver position in CL3. The average STI of the 40 receiver positions is 0.48 before the renovation, and 0.74 after the renovation, indicating an increase of 0.26, i.e., approximately 9 just noticeable differences (JNDs, the JND of STI is 0.03 (Bradley et al. 1999a)). Similar to C 50 , the STI difference between different receiver positions in a classroom is also obvious, indicating the sensitivity of the STI to changes in the architectural acoustic conditions. ...
Article
The acoustic environment of the classroom is one of the most important factors influencing the teaching and learning effects of the teacher and students. It is critical to ensure good speech intelligibility in classrooms. However, due to some factors, it may not be easy to achieve an ideal classroom acoustic environment, especially in large-scale multimedia classrooms. In a real renovation project of 39 multimedia classrooms in a university, seven typical rooms were selected, and the acoustic environment optimisation design and verification for these multimedia classrooms were performed based on simulation. First, the acoustic and sound reinforcement design schemes were determined based on the room acoustics software ODEON. Next, the effects of the optimisation design were analysed, and the simulated and measured results were compared; the accuracy of using the reduced sound absorption coefficients, which were determined empirically, was also examined. Finally, the recommended reverberation times (RTs) in multimedia classrooms corresponding to speech intelligibility were discussed, the effectiveness of the speech transmission index (STI) as a primary parameter for classroom acoustic environment control was considered, and the acoustic environment under the unoccupied and occupied statuses was compared. The results revealed that although there are many factors influencing the effect of classroom acoustic environment control, an adequate result can be expected on applying the appropriate method. Considering both the acoustic design and visual requirements also makes the classroom likely to have a good visual effect in addition to having a good listening environment.
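The "approximately 9 JNDs" figure in the excerpt for this entry follows from dividing the change in a parameter by its just noticeable difference. A minimal sketch of that arithmetic, using the JND values reported in the abstract of the cited paper (0.03 for STI, 1.1 dB for C50); the helper name `jnd_steps` is hypothetical:

```python
STI_JND = 0.03      # just noticeable difference in STI (Bradley et al., 1999)
C50_JND_DB = 1.1    # just noticeable difference in C50, in dB (same source)


def jnd_steps(change, jnd):
    """Express a change in a parameter as a (fractional) number of JNDs."""
    if jnd <= 0:
        raise ValueError("jnd must be positive")
    return abs(change) / jnd


if __name__ == "__main__":
    # Average STI before and after the renovation, as quoted in the excerpt.
    steps = jnd_steps(0.74 - 0.48, STI_JND)
    print(f"STI change of 0.26 is about {steps:.1f} JNDs")
```

The same helper applies to C50 by passing `C50_JND_DB`; whether a given number of JNDs is meaningful in everyday listening is a separate question, which the cited paper addresses with its suggestion of roughly 3 dB for a readily detectable improvement.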
... When considering an early-to-late ratio for a limit of 50 ms, consistent with the definition of speech clarity C50, the difference between measurements and generated impulse responses is around 1 dB. For speech signals, the just-noticeable difference of C50 is around 1.1 dB, and a significant change value is considered to be around 3 dB [33]. This value is greater than the minimal early energy difference observed between the measured impulse responses and those generated by the envelope-based model. ...
Article
Full-text available
Audio-only augmented reality consists of enhancing a real environment with virtual sound events. A seamless integration of the virtual events within the environment requires processing them with artificial spatialization and reverberation effects that simulate the acoustic properties of the room. However, in augmented reality, the visual and acoustic environment of the listener may not be fully mastered. This study aims to gain some insight into the acoustic cues (intensity and reverberation) that are used by the listener to form an auditory distance judgment, and to observe if these strategies can be influenced by the listener’s environment. To do so, we present a perceptual evaluation of two distance-rendering models informed by a measured Spatial Room Impulse Response. The choice of the rendering methods was made to design stimuli categories in which the availability and reproduction quality of acoustic cues are different. The proposed models have been evaluated in an online experiment gathering 108 participants who were asked to provide judgments of auditory distance about a stationary source. To evaluate the importance of environmental cues, participants had to describe the environment in which they were running the experiment, and more specifically the volume of the room and the distance to the wall they were facing. It could be shown that these context cues had a limited, but significant, influence on the perceived auditory distance.
... A difference of at least 2 dB was needed for the jury to perceive a difference [35]. In a study by Bradley and Reich, it was concluded that 3 dB is a relevant value as a just noticeable difference (JND) in rooms for speech [36]. In ISO 3382-1 [14], in performance spaces, the typical value for JND is 1 dB for C 50 . ...
Article
Full-text available
In environments such as classrooms and offices, complex tasks are performed. A satisfactory acoustic environment is critical for the performance of such tasks. To ensure a good acoustic environment, the right acoustic treatment must be used. The relation between different room acoustic treatments and how they affect speech perception in these types of rooms is not yet fully understood. In this study, speech perception was evaluated for three different configurations using absorbers and diffusers. Twenty-nine participants reported on their subjective experience of speech in respect of different configurations in different positions in a room. They judged sound quality and attributes related to speech perception. In addition, the jury members ranked the different acoustic environments. The subjective experience was related to the different room acoustic treatments and the room acoustic parameters of speech clarity, reverberation time and sound strength. It was found that people, on average, rated treatments with a high degree of absorption as best. This configuration had the highest speech clarity value and lowest values for reverberation time and sound strength. The perceived sound quality could be correlated to speech clarity, while attributes related to speech perception had the strongest association with reverberation time.
... When considering an early-to-late ratio for a limit of 50 ms, consistent with the definition of speech clarity C50, the difference between measurements and generated impulse responses is around 1 dB. For speech signals, the just-noticeable difference of C50 is around 1.1 dB, and a significant change value is considered to be around 3 dB [24]. The early energy difference observed between the measured impulse responses and those generated by the envelope-based model is less than 3 dB. ...
Article
The visual and acoustic environment may influence the perception of auditory distance. In the context of audio-only augmented reality (AAR), the coherence of the perceived virtual sound sources with the apparent room geometry and acoustics cannot always be guaranteed. The perceptual consequences of these incoherences are not well known. We conducted two online perceptual studies with a sound distance rendering model based on measured spatial room impulse responses (SRIR). A first study evaluated the perceptual performance of the model in incongruent visual contexts. The incongruent environment-related visual cues (spatial visual boundary and room volume) demonstrated a significant effect on the auditory distance perception (ADP) of virtual sound sources, through a calibration effect. A second study evaluated the impact of acoustical incongruence. The distances of virtual sound sources were judged after the participant listened to distracting sound sources conveying distance cues relative to a different acoustical environment. When these distracting sound sources corresponded to a larger room than the one reproduced by the model, a higher compression effect was observed on the ADP of virtual sound sources. However, when the intensity cues conveyed by the distracting sound sources were coherent with the acoustical environment simulated by the model, their distracting effect was negligible.
... whereas their just noticeable differences (JND) are between 1 and 3 dB [5], [6]. ...
Conference Paper
Full-text available
Early-to-late energy ratios (ELER) are used to quantify speech intelligibility and music clarity in acoustic spaces from measurements of omnidirectional room impulse responses (RIR). Nowadays, the capture of directional RIRs is possible with spherical microphone arrays and the spherical Fourier transform. These tools are thus motivating the enhancement of omnidirectional metrics and the search for new metrics to quantify directional features of sound. This research explores a directional metric of intelligibility and clarity based on ELERs of directional RIRs. The early-to-late transition times are chosen according to the content: 50 ms for speech and 80 ms for music. The proposed metrics can therefore be interpreted as directional versions of the standard clarity indexes of speech (C50) and music (C80). Directional RIRs were captured at many seats in a large auditorium using a first-order ambisonics microphone. Supporting acoustic simulations of a cuboid room with a second-order ambisonics microphone were also used. Directional ELERs were calculated in the octave bands within the operation range of the microphones. Three directional ELER patterns were identified: an omnidirectional pattern, a dipole pointing forward and backward, and a beam pointing towards the source.
... The United Kingdom's Building Bulletin 93 (2015) recommends a speech transmission index (STI) of 0.6 or higher. Bradley et al. (1999) published a regression equation that linked 0.6 STI with a C50 value of 1 dB, with every 0.1 increase in STI resulting in a 3 dB increase for C50. More recently, Italy released a national standard, UNI 11532, on classroom acoustics, which lists 2 dB as the minimum desired C50, averaged across the midfrequencies of 500 Hz, 1 kHz, and 2 kHz and averaged across measurement positions (Astolfi et al., 2019a). ...
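The relationship quoted in this excerpt (an STI of 0.6 corresponding to a C50 of 1 dB, with 3 dB of C50 per 0.1 of STI) can be read as a straight line. The sketch below encodes only that reading; the function names are hypothetical, and the coefficients are inferred from the excerpt's wording, not copied from the published regression equation.

```python
def c50_from_sti(sti):
    """C50 in dB implied by the excerpt's linear reading (assumption)."""
    # Anchor point (0.6, 1 dB) and slope 3 dB per 0.1 STI = 30 dB per unit STI.
    return 1.0 + 30.0 * (sti - 0.6)

def sti_from_c50(c50_db):
    """Inverse of c50_from_sti under the same linear assumption."""
    return 0.6 + (c50_db - 1.0) / 30.0

# A 3 dB improvement in C50 corresponds to 0.1 in STI under this reading.
delta_sti = sti_from_c50(4.0) - sti_from_c50(1.0)
```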
Article
Full-text available
This project acquired sound levels logged across six school days and impulse responses in 220 classrooms across four K–12 grades. Seventy-four percent met reverberation time recommendations. Sound levels were processed to estimate occupied signal-to-noise ratios (SNRs), using Gaussian mixture modeling and from daily equivalent and statistical levels. A third method, k-means clustering, estimated SNR more precisely, separating data on nine dimensions into one group with high levels across speech frequencies and one without. The SNRs calculated as the daily difference between the average levels for the speech and non-speech clusters are found to be lower than 15 dB in 27.3% of the classrooms and differ from using the other two methods. The k-means data additionally indicate that speech occurred 30.5%–81.2% of the day, with statistically larger percentages found in grade 3 compared to higher grades. Speech levels exceeded 65 dBA 35% of the day, and non-speech levels exceeded 50 dBA 32% of the day, on average, with grades 3 and 8 experiencing speech levels exceeding 65 dBA statistically more often than the other two grades. Finally, classroom speech and non-speech levels were significantly correlated, with a 0.29 dBA increase in speech levels for every 1 dBA in non-speech levels.
... Further, it is important to consider the values in different positions and not only on an average level. Regarding the importance of good speech clarity, Bradley and Reich investigated [34] the JND for C 50 in rooms for speech, finding that 3 dB is more relevant for these types of rooms than the 1 dB value stated in the standard 3382-1 for performance spaces. However, JND for speech and in ordinary room is not fully understood. ...
Article
Full-text available
In ordinary public rooms absorbent ceilings are normally used. However, reflective material such as diffusers can also be useful to improve the acoustic performance for this type of environment. In this study, different combinations of absorbers and diffusers have been used. The study investigates whether a test group of 29 people perceived sound in an ordinary room differently depending on the type of treatment. Comparisons of the same position in a room for different configurations as well as different positions within one configuration were made. The subjective judgements were compared to the room acoustic measures T20, C50 and G and the difference in the values of these parameters. It was found that when evaluating the different positions in a room, the configuration including diffusers was perceived to a greater extent as being similar in the different positions in the room when compared to the configuration with absorbers on the walls. It was also seen that C50 was the parameter that mainly affected the perception, with the difference needing to be 2 dB to recognize a difference. However, the room acoustic measurements could not fully explain the differences obtained in perception. In addition, the subjective sound image created by different types of treatments was also shown to have an important impact on the perception.
Conference Paper
In this paper, the full modulation method of speech transmission index (STI) estimation and its modification in the form of the full formant-modulation (FM) method are compared in terms of measurement accuracy in conditions where the speech signal is masked by noise and reverberation. Dependences of STI estimation errors on signal-to-noise ratios and on the duration of test signals were obtained for a reverberation time of 0.8 s, typical for university auditoriums. It is shown that the accuracy of STI estimation in the presence of reverberation is practically independent of the choice of estimation method. The results indicate that an STI estimation error of 0.01-0.02, acceptable for practical use, can be ensured under the joint action of noise and reverberation when using test signals lasting 8-16 s.
Article
Full-text available
In this study, the apparent variation ranges of acoustical parameters were investigated in a concert hall. The initial time delay gap (ITDG) was evaluated in terms of its just noticeable difference (JND) through two instruments, the cello and the trumpet. Even though the ITDG values were prolonged over the measurements and were not significantly varied, an ITDG range of 22-220 ms in increments of 91 steps was produced electro-acoustically in an anechoic chamber. The result of JNDc (∆gap/gap) was rated by 50% "different" judgement ranges for the cello and trumpet tracks, respectively. The effective duration of the autocorrelation function (ACF) of the continuous brainwaves (CBWs) within the alpha (8-13 Hz) frequency range in the left hemisphere responding to 91-step ITDG increments revealed that the continuous ratios of τe_min ((τe_min_rear − τe_min_front)/τe_min_front) of CBWs were constantly on the trumpet. Furthermore, a homologous period of resonance between the subjective JNDc and τe values of the ACF of CBWs in the alpha range allowed us to conclude that the subjective JND of the ITDG in a room was related to the W_IACC value of the interaural cross-correlation function, which reflected the characteristics of source signals themselves and aroused the activities of the brain in the right hemisphere (p < 0.01). The dry sources of sound stimuli were first used to link the psychological preference and the neurophysiological activation of the room acoustics.
Article
The listener head orientation significantly affects speech intelligibility (SI) in the small acoustic space of an automobile, owing to near-field binaural listening and special acoustic conditions such as early reflections and seat-back occlusions. However, situations involving a back-row listener and a front-row speaker (i.e., front-to-rear scenarios) have not yet been studied. Such scenarios are dominated by non-uniform early reflections rather than direct sounds, making them distinct from cases considered in previous studies. This study investigates the effect of listener head orientation on SI in front-to-rear scenarios in an automobile, using both the speech transmission index (STI) and subjective experiments. A virtual speaker (i.e., a loudspeaker) is located in the driver’s seat and faces forward. Binaural room impulse responses are measured on a dummy head in the three rear-row seats with different head orientations. Using the binaural room impulse responses, the corresponding STIs are calculated, and the Chinese SI scores are measured virtually via headphones. The results show that the SI variation range for various listener head orientations in front-to-rear scenarios is greatly reduced compared with that in the rear-to-front scenarios considered in previous studies. Overall, the effect of listener head orientation on SI is mostly negligible when a listener in the rear row listens to a speaker in the front, with variations in the SI score of no more than 5%. The present study provides a supplement to previous studies and helps to deepen understanding of the effect of multiple factors on SI in an automotive cabin.
Article
Previously proposed methods for estimating acoustic parameters from reverberant, noisy speech signals exhibit insufficient performance under changing acoustic conditions. A data-centric approach is proposed to overcome the limiting assumption of fixed source-receiver transmission paths. The obtained solution significantly enlarges the scope of potential applications for such estimators. The joint estimation of reverberation time RT60 and clarity index C50 in multiple frequency bands is studied with a focus on dynamic acoustic environments. Three different convolutional recurrent neural network architectures are considered to solve the tasks of single-band, multi-band, and multi-task parameter estimation. A comprehensive performance evaluation is provided that highlights the benefits of the proposed approach.
Article
Full-text available
The speech transmission index (STI) and room acoustic parameters (RAPs) are essential metrics for assessing speech quality and predicting listening difficulty in a sound field. Although STI and important RAPs, such as reverberation time and clarity, can be derived from the room impulse response (RIR), measuring the RIR in regularly occupied spaces is difficult. Hence, simultaneous blind estimation of STI and RAPs is an important challenge that must be addressed. However, most existing methods provide only a single parameter and require a massive dataset for model training. A deterministic method is presented for blindly estimating STI and five RAPs using a stochastic RIR model that approximates an unknown RIR. An algorithm is formulated that uses the temporal power envelope of a reverberant speech signal to determine the optimal parameters of the RIR model. A mathematical model of the reverberation and dereverberation processes is proposed, based on the temporal power envelope of the signals. This model maps the parameters of the RIR model to the observed reverberant signal. The estimated RIR can then be synthesized using the optimal parameters to estimate the STI and RAPs. A simulation was conducted to evaluate the simultaneous estimation of STI and five essential RAPs from observed reverberant speech signals, in comparison to the best existing previous work. The root-mean-square error (RMSE) and Pearson correlation coefficient between the estimated and measured values were used as evaluation metrics. In terms of STI, the proposed method achieves the best accuracy with an RMSE of 0.037. With regard to the reverberation time and other RAPs, the accuracy remains consistent with the previous works. The results show that the proposed method can effectively estimate STI and RAPs simultaneously without any training.
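The two evaluation metrics named in this abstract, RMSE and the Pearson correlation coefficient, can be sketched in a few lines. The STI values below are made up purely to exercise the functions; they are not results from the paper.

```python
import math

def rmse(estimated, measured):
    """Root-mean-square error between paired estimates and measurements."""
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical STI values, for illustration only.
measured = [0.45, 0.55, 0.62, 0.70]
estimated = [0.48, 0.52, 0.65, 0.73]
error = rmse(estimated, measured)
correlation = pearson(estimated, measured)
```

RMSE reports the typical size of the error in the parameter's own units, while the correlation coefficient reports how well the estimates track the measurements regardless of a constant offset, which is why the two are usually quoted together.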
Article
Today, millions of standardized English as a foreign language proficiency tests are administered globally each year. A large portion of these are conducted as paper-based tests in which the listening section is commonly delivered through loudspeakers to groups of test takers, a method in which the audio signals are exposed to the acoustic tendencies of each particular venue. As it is well-established in the literature that non-native listeners are more susceptible to adverse listening conditions compared to their native counterparts, there is a need for an objective examination of the acoustic quality of such environments. This study examined the speech transmission index for public address systems (STIPA) for three types of sound sources (wall-mounted speakers, radio cassette player, and amplified speaker) and reverberation time (RT) in 10 unoccupied classrooms commonly used as test rooms at a university in Japan. The results revealed that STI was statistically significantly different for the amplified speaker compared to both or one other sound source in eight out of 10 rooms. The amplified speaker also recorded the highest STI among the three sound sources in eight out of 10 rooms and the most rooms with STI entirely above 0.66, a minimum target value prescribed in IEC 60268–16:2020 as exhibiting high speech intelligibility. Additionally, ≥ 0.66 STI was consistently observed in rooms with RT0.5-2kHz ≤ 0.7 s. Further observations are discussed to better understand the current conditions under which these tests are administered.
Article
Full-text available
Abstract Study rooms are open-plan shared environments containing several workstations that may or may not be separated by partitions. This work analyzes the interference with speech intelligibility and the acoustic comfort conditions in shared study rooms due to the noise produced by desktop mini-fans (plastic and metal) and air conditioners (split and window units). Six shared study rooms were evaluated in 11 configurations, varying the use of the cooling equipment, by means of the NC and RC Mark II noise curves, reverberation time (RT), early decay time (EDT), speech transmission index (STI), and definition (D50). The noise-curve results show that all situations analyzed with the mini-fans meet NC40 (45 dB) for shared offices, which was not the case when the air conditioners were in use. All configurations considered with the cooling equipment showed an RC Mark II curve with a hissy character (spectrum unbalanced at high frequencies). In addition, the physical conditions of the rooms, notably the small sound-absorbing area, produce acoustic parameter values considered unsuitable for shared study activities, with the rooms of largest volume giving the worst results.
Thesis
Full-text available
Regardless of the field, measurements are essential for validating theories and making well-founded decisions. A criterion for the validity and comparability of measured values is their uncertainty. Still, in room acoustical measurements, the application of established rules to interpret uncertainties in measurement is not yet widespread. This raises the question of the validity and interpretability of room acoustical measurements. This work discusses the uncertainties in measuring room acoustical single-number quantities in a way that complies with the framework of the "Guide to the Expression of Uncertainty in Measurement" (GUM). The starting point is a structured search for variables that potentially influence the measurement of room impulse responses. In a second step, this uncertainty is propagated through the algorithm that determines single-number quantities. A second emphasis is placed on the investigation of spatial fluctuations of the sound field in auditoria. The spatial variance of the sound field in combination with an uncertain measurement position marks a major contribution to the overall measurement uncertainty. To reach general conclusions, the relation between changes in the measurement location and the corresponding changes in measured room acoustical quantities is investigated empirically in extensive measurement series. This study shows how precisely a measurement position must be defined to ensure a given uncertainty of room acoustical single-number quantities. The presented methods form a foundation that can be flexibly extended in future investigations to include additional influences on the measurement uncertainty.
Article
Artificial reverberation (AR) models play a central role in various audio applications. Therefore, estimating the AR model parameters (ARPs) of a reference reverberation is a crucial task. Although a few recent deep-learning-based approaches have shown promising performance, their non-end-to-end training scheme prevents them from fully exploiting the potential of deep neural networks. This motivates the introduction of differentiable artificial reverberation (DAR) models, allowing loss gradients to be back-propagated end-to-end. However, implementing the AR models with their difference equations “as is” in the deep learning framework severely bottlenecks the training speed when executed with a parallel processor like GPU due to their infinite impulse response (IIR) components. We tackle this problem by replacing the IIR filters with finite impulse response (FIR) approximations with the frequency-sampling method. Using this technique, we implement three DAR models—differentiable Filtered Velvet Noise (FVN), Advanced Filtered Velvet Noise (AFVN), and Delay Network (DN). For each AR model, we train its ARP estimation networks for analysis-synthesis (RIR-to-ARP) and blind estimation (reverberant-speech-to-ARP) task in an end-to-end manner with its DAR model counterpart. Experiment results show that the proposed method achieves consistent performance improvement over the non-end-to-end approaches in both objective metrics and subjective listening test results.
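The frequency-sampling idea mentioned in this abstract can be illustrated on the simplest possible IIR filter. The sketch below samples the frequency response of a one-pole filter at N uniformly spaced points and takes an inverse DFT to obtain an N-tap FIR approximation. The filter choice, function names, and pure-Python DFT are assumptions for illustration; the paper's reverberation models and deep-learning implementation are not reproduced here.

```python
import cmath
import math

def one_pole_response(a, omega):
    """Frequency response of H(z) = 1 / (1 - a z^-1) at angular frequency omega."""
    return 1.0 / (1.0 - a * cmath.exp(-1j * omega))

def frequency_sampled_fir(a, n_taps):
    """FIR taps obtained by sampling the IIR response and inverting the DFT."""
    freqs = [2.0 * math.pi * k / n_taps for k in range(n_taps)]
    samples = [one_pole_response(a, w) for w in freqs]
    taps = []
    for n in range(n_taps):
        # Inverse DFT, written out directly to keep the example dependency-free.
        acc = sum(samples[k] * cmath.exp(1j * freqs[k] * n) for k in range(n_taps))
        taps.append((acc / n_taps).real)
    return taps

taps = frequency_sampled_fir(a=0.5, n_taps=32)
```

Because the response is sampled at a finite number of frequencies, the taps equal the time-aliased impulse response, here a^n / (1 - a^N) in place of a^n. The approximation error therefore shrinks as the number of taps grows relative to the filter's decay time.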
Article
Full-text available
In ordinary public rooms, such as classrooms and offices, an absorbent ceiling is the typical first acoustic action. This treatment provides a good acoustic baseline. However, an improvement of specific room acoustic parameters, operating for specific frequencies, can be needed. It has been seen that diffusing elements can be effective additional treatment. In order to choose the right design, placement, and quantity of diffusers, a model to estimate the effect on the acoustics is necessary. This study evaluated whether an SEA model could be used for that purpose, particularly for the cases where diffusers are used in combination with an absorbent ceiling. It was investigated whether the model could handle different quantities of diffusing elements, varied diffusion characteristics, and varied installation patterns. It was found that the model was sensitive to these changes, given that the output from the model in terms of acoustic properties will be reflected by the change of diffuser configuration design. It was also seen that the absorption and scattering of the diffusers could be quantified in a laboratory environment: a reverberation chamber. Through the SEA model, these quantities could be transformed to a full-scale room for estimation of the room acoustic parameters.
Article
The speech transmission index (STI) and room-acoustic parameters, such as the reverberation time and clarity index, are essential to assess the quality of room acoustics. However, in everyday spaces occupied by people, it is difficult to obtain such parameters since the room impulse response (RIR) cannot be measured. Blind estimation of room acoustic parameters from observed signals without measuring RIR is therefore necessary. Although existing methods can estimate one of these parameters, a single parameter is inadequate to describe comprehensive subjective aspects. To this end, this paper proposes a method for estimating STI and five room-acoustic parameters simultaneously. The temporal amplitude envelope of a reverberant speech signal is mapped to the parameters of an RIR model for each sub-band. Instead of using Schroeder’s RIR model, the extended RIR model is used to approximate an unknown RIR so that the STI and room-acoustic parameters can be derived. We performed simulations to evaluate the proposed method in unseen reverberant environments. The root-mean-square errors between the ground-truths and estimated parameters suggest that the accuracy of the proposed scheme outperformed that with Schroeder’s RIR model and was close to the standard measurements.
Article
Full-text available
Introduction: Quadric surfaces are commonly used in buildings due to their geometric ability to distribute and focus sound waves. The Central Hall in Palau Güell — a UNESCO World Heritage Site — is topped by an ellipsoidal dome. Antoni Gaudí envisaged this room as a concert hall where the organ and the dome play a lead role. Methods: The two previously mentioned elements are the main subject of our paper, which serves two purposes: 1) determining the values of the acoustic parameters of the hall through onsite measurement and also through simulation, and 2) using the geometric parameters of the quadric surface, which best fits the dome, in order to check whether it is possible to improve the acoustics of the hall by placing a new emission source at the focus of the dome’s ellipsoid. Results and Discussion: Contrary to the authors’ expectations, due to the focal reflection properties of the quadric surface, some acoustic parameters on the listening plane do not improve significantly. Therefore, we conclude that Gaudí took the acoustical impact into account when designing this hall.
Preprint
Full-text available
We propose differentiable artificial reverberation (DAR), a family of artificial reverberation (AR) models implemented in a deep learning framework. Combined with modern deep neural networks (DNNs), the differentiable structure of DAR allows training loss gradients to be back-propagated in an end-to-end manner. Most AR models bottleneck training speed when implemented "as is" in the time domain and executed on a parallel processor such as a GPU, due to their infinite impulse response (IIR) filter components. We tackle this by further developing a recently proposed acceleration technique, which borrows the frequency-sampling method (FSM). With the proposed DAR models, we aim to solve an artificial reverberation parameter (ARP) estimation task in a unified approach. We design an ARP estimation network applicable to both analysis-synthesis (RIR-to-ARP) and blind estimation (reverberant-speech-to-ARP) tasks. Using a different DAR model requires only a slightly different decoder configuration. This way, the proposed DAR framework overcomes the previous methods' limitations of task-dependency and AR-model-dependency.
Article
We introduce an A-weighting variance measurement, an objective estimation of the sound quality generated by geometric acoustic methods. Unlike the previous measurement, which applies to the impulse response, our measurement establishes the relationship between the impulse response and the auralized sound that the user hears. We also develop interactive methods to evaluate the measurement at run time and an adaptive algorithm that balances quality and performance based on the measurement. Experiments show that our method is more efficient in a wide variety of scene geometry, input sound, reverberation, and path tracing strategies.
Article
Full-text available
The sensitivity of listeners to small changes in the early sound field of auditoria has been measured. This was done using a realistic artificial simulation system of a concert hall sound field. The simulation system was designed so that standard objective parameter values were typical of those found in real halls, and within ranges known to be subjectively preferred. The following difference limens were measured: early lateral energy fraction; inter-aural cross correlation coefficient; clarity index and centre time. From these results it was shown that when changes are made to the early sound field, changes in perceived spatial impression will usually be larger than those for clarity. Furthermore it was found that acousticians can gain most by paying attention to lateral sound. Other measurements showed that: (i) the initial time delay gap is not significant to listener preference, and (ii) diffuse reflections in the early sound field are not perceived differently from specular reflections.
Article
Full-text available
Speech intelligibility in rooms is determined by both room acoustics characteristics as well as speech-to-noise ratios. These two types of effects are combined in measures such as useful-to-detrimental sound ratios which are directly related to speech intelligibility. This paper reports investigations of optimum acoustical conditions for classrooms using the ODEON room acoustics computer model. By determining conditions that relate to maximum useful-to-detrimental sound ratios, optimum conditions for speech are determined. The results show that an optimum mid-frequency reverberation time for a classroom is approximately 0.5 s, but speech intelligibility is not very sensitive to small deviations from this optimum. Speech intelligibility is influenced more strongly by ambient noise levels. The optimum location of sound absorbing material was found to be on the upper parts of the walls.
Article
A study is presented on the detectability of differences of the definition of speech and of the clarity of music presented in listening rooms. To validate earlier results obtained using a simulated acoustic field, audibility tests were performed using a dummy head in an actual room. The values obtained in these tests for the limens of perception are in good agreement with those obtained using simulated fields, bearing in mind the uncertainties of subjective judgments. The investigations in an actual room thus confirm and refine the results of the auditory tests conducted with synthetic signals. Looking for a practical application of the results as an indicator for determining the optimal dimensions of the "seating zone" of a room, we propose ΔC50 ≈ ±2.5 dB and ΔC80 ≈ ±3.0 dB as the limens of "just perceptible differences", where the quantities ΔC50 and ΔC80 represent the maximum permissible values of the differences in the measures of definition for speech and the measures of clarity for music.
Article
The early-reflection portion of the sound decay process is of particular importance since it is most responsible for subjective impression. Curves showing early-to-late sound energy ratios (ELR) in the early-reflection period can be helpful in analyzing energy-time characteristics within that period. A measurement and analysis process is described which employs energy-time curves (ETC) and corresponding displays of early-to-late sound energy ratio values in the early-reflection period between 20 and 200 ms. Speech-weighted C50 values for quantifying speech intelligibility and frequency-averaged C80 values for quantifying music clarity are among the objective measures provided by the ELR procedure.
Article
The Signal-to-Noise Ratio devised by Lochner and Burger contributed an objective design index for predicting speech intelligibility. Their index provided a measure of useful and detrimental reflected speech energy according to the integration and masking characteristics of hearing, and enabled predictions to be made from impulse measurements in models. However, it was found necessary to extend the Signal-to-Noise Ratio theory to account for the effect of fluctuating ambient background noise on speech intelligibility. A modified Signal-to-Noise Ratio was derived from a best-fitting empirical correlation with speech intelligibility in a series of measurements in existing auditoria. In the modified Signal-to-Noise Ratio ambient background noise is no longer considered in terms of its steady state characteristics but more specifically in terms of its transient and spectral characteristics given by the concept of the L10 PNC level. The index has been applied as design criteria to prediction and to evaluation techniques.
Article
An observer in an auditorium receives first the direct sound from the source and after that a large number of reflections from the different surfaces in the enclosure. This very intricate sound pattern is analysed by the hearing system and gives rise to those acoustical qualities normally attributed to the auditorium. The present article summarizes the work carried out in this laboratory with the object of throwing more light on the interpretation, by the hearing mechanism, of reflection patterns in auditoria and the application of these principles to the design of auditoria.
Article
The speech transmission index, the useful-to-detrimental ratio, and the percent articulation loss of consonants are three quite different types of measures of speech intelligibility in rooms. They each combine a measure of the speech-to-noise ratio and a measure of the room acoustics to better relate to speech intelligibility in rooms. Values of all three types of measures were calculated from 91 room impulse responses obtained from a wide range of acoustical conditions, and for different speech-to-noise ratios. Several forms of these measures are shown to be reasonably well related to each other. The calculated regression equations relating the various measures of speech intelligibility permit practical conversions among the measures.
Article
Details of a procedure for efficiently and accurately calculating various auditorium acoustics measures from pistol shots are described. Early/late sound ratios, decay times, center time, and modulation transfer functions are calculated in octave bands. Measurements for a wide range of acoustical conditions are compared with simple predictions, and various quantities are found to be strongly intercorrelated.
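As a rough illustration of two of the quantities named in this abstract, the sketch below computes an early/late sound ratio (C50 or C80) and the center time from a broadband impulse response. It is not the procedure of the cited article, which works in octave bands from pistol shots. The function names `early_late_ratio_db` and `center_time_s` are hypothetical, and the code assumes the direct sound arrives at sample 0 and that the response has already been trimmed above the noise floor.

```python
import numpy as np

def early_late_ratio_db(ir, fs, t_early=0.05):
    """Early/late energy ratio in dB (C50 for t_early=0.05 s, C80 for 0.08 s).

    Assumes the direct sound is at sample 0 of `ir` (hypothetical convention).
    """
    energy = np.asarray(ir, dtype=float) ** 2
    split = int(round(t_early * fs))      # first sample of the "late" part
    early = energy[:split].sum()
    late = energy[split:].sum()
    return 10.0 * np.log10(early / late)

def center_time_s(ir, fs):
    """Center time: energy-weighted mean arrival time, in seconds."""
    energy = np.asarray(ir, dtype=float) ** 2
    t = np.arange(energy.size) / fs
    return float((t * energy).sum() / energy.sum())

# Synthetic test signal: an ideal exponential decay with a 1 s reverberation
# time (energy falls by 60 dB in rt60 seconds). Real responses are noisier.
fs = 48000
rt60 = 1.0
t = np.arange(int(2.0 * fs)) / fs
ir = np.exp(-np.log(1e6) / 2.0 * t / rt60)   # amplitude envelope

c50 = early_late_ratio_db(ir, fs, 0.05)
c80 = early_late_ratio_db(ir, fs, 0.08)
ts = center_time_s(ir, fs)
print(f"C50 = {c50:.2f} dB, C80 = {c80:.2f} dB, Ts = {ts * 1000:.1f} ms")
```

For an ideal exponential decay the results can be checked against the closed forms C_te = 10 log10(exp(13.8 te / T) - 1) and Ts = T / 13.8, which is one way to see why such measures tend to be strongly intercorrelated with decay time.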
Article
Speech intelligibility tests and acoustical measurements were made in ten occupied classrooms. Octave-band measurements of background noise levels, early decay times, and reverberation times, as well as various early/late sound ratios, and the center time were obtained. Various octave-band useful/detrimental ratios were calculated along with the speech transmission index. The interrelationships of these measures were considered to evaluate which were most appropriate in classrooms, and the best predictors of speech intelligibility scores were identified. From these results ideal design goals for acoustical conditions for classrooms were determined either in terms of the 50-ms useful/detrimental ratios or from combinations of the reverberation time and background noise level.
Article
Three different types of acoustical measures were compared as predictors of speech intelligibility in rooms of varied size and acoustical conditions. These included signal-to-noise measures, the speech transmission index derived from modulation transfer functions, and useful/detrimental sound ratios obtained from early/late sound ratios, speech, and background levels. The most successful forms of each type of measure were of similar prediction accuracy, but the useful/detrimental ratios based on a 0.08-s early time interval were most accurate. Several physical measures, although based on very different calculation procedures, were quite strongly related to each other.
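The abstract does not state how the useful/detrimental ratios were formed. One commonly used formulation, shown here as an assumption rather than as the article's exact method, treats early-arriving speech energy as useful and the sum of late-arriving speech energy and background noise energy as detrimental. The function name `useful_detrimental_db` and its parameters are hypothetical.

```python
import math

def useful_detrimental_db(c_early_late_db, speech_to_noise_db):
    """Useful/detrimental ratio in dB under the assumed formulation.

    c_early_late_db: early/late ratio in dB (e.g. C50 or C80).
    speech_to_noise_db: total speech level minus background noise level, in dB.
    """
    c = 10.0 ** (c_early_late_db / 10.0)        # linear early/late ratio
    snr = 10.0 ** (speech_to_noise_db / 10.0)   # linear speech/noise ratio
    early = c / (1.0 + c)      # early fraction of total speech energy
    late = 1.0 / (1.0 + c)     # late fraction of total speech energy
    noise = 1.0 / snr          # noise energy relative to total speech energy
    return 10.0 * math.log10(early / (late + noise))

# Example: equal early and late energy (0 dB) with noise as loud as the speech.
u = useful_detrimental_db(0.0, 0.0)
print(f"U = {u:.2f} dB")
```

Under this formulation the ratio approaches the plain early/late ratio when the noise is negligible and approaches the early-energy-to-noise ratio when the noise dominates, which is why it combines room acoustics and speech-to-noise ratio in a single number.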