Article

On the combined effects of signal-to-noise ratio and room acoustics on speech intelligibility


Abstract

Speech intelligibility in rooms is influenced by room acoustics effects and by the signal-to-noise ratio (S/N) of the speech and ambient noise. Several measures such as useful-to-detrimental sound ratios and the speech transmission index predict the combined effects of both types of factors. These measures were evaluated relative to speech intelligibility test results obtained in simulated sound fields. The use of simulated sound fields made it possible to create the full range of combinations of room acoustics and S/N effects likely to be found in rooms for speech. The S/N aspect is shown to be much more important than room acoustics effects and new broadband useful-to-detrimental ratios were validated. Useful-to-detrimental ratios, speech transmission index measures, and values of the articulation loss for consonants were all reasonably accurate predictors of speech intelligibility. Further improvements to these combined measures are suggested.


... In terms of Early Decay Time (EDT), as shown in Figure 9b, it was found to be very similar between the stalls and balcony. Between 500 Hz and 4 kHz, the values fluctuate around 0.4 s, while at lower frequencies, the results rise to 0.6 s, and at higher octaves, the ...
... The values of Definition (D50), as shown in the graph of Figure 11, fluctuate between 0.75 and 0.85 at low and mid frequencies and rise to 0.9 in the high-frequency bands, meaning that the speech definition is considered good inside the Monte Castello di Vibio theatre; values higher than 0.5 are preferable [16], considered very suitable for opera. ...
... The graph of Figure 10a indicates that the clarity index related to music (C80) for both the stalls and balcony has been found to be too high in all octaves, and according to the standard, C80 should be at most 1-2 dB for good musical performance [14]. This result negates the suitability of the Monte Castello di Vibio theatre for opera and orchestral music. ...
Article
The acoustic characteristics and spatial features of the world’s only surviving Italianate Gordonia-style miniature theatre, one of the smallest theatres in the world, have inspired the author to analyse the acoustic behaviour of the Monte Castello di Vibio theatre, also called “Teatro della Concordia”. In this paper, the geometric and architectural features of this historical and unique performing art space were first reproduced, considering that these features are essential factors affecting acoustic characteristics. Subsequently, the acoustic measurements were taken throughout the stall and inside some selected boxes, and their main parameters were acoustically characterised according to ISO 3382-1. Lastly, the main acoustic parameters of the Monte Castello di Vibio theatre were compared to those of the 1763 theatre in Bologna, which is also a miniature theatre of similar size. The aim is to explore the main influences on the acoustic parameters of miniature theatres, and the results show that the plan layout of the theatre and interior decoration are the main factors influencing the acoustic characteristics rather than volume. Preserving the acoustic features of this unique heritage building is also seen as one of the goals of this paper.
... As far as speech intelligibility is concerned, according to Formula (3) in Bradley (1986),44 the recommended clarity for speech sounds, C50,45 should be greater than 2 ± 1 and 4 ± 1 dB at mid frequencies for small classrooms with RTs of 0.8 and 0.6 s, respectively, and a 1-kHz useful-to-detrimental ratio, U50, of 1.0 dB is recommended overall for a high level of speech intelligibility. Bradley et al. (1999)46 showed that useful-to-detrimental ratios, speech transmission index measures, and values of the articulation loss of consonants are all accurate predictors of speech intelligibility. Moreover, they stated that U50, as obtained from measured C50, averaged over the four octaves from 500 Hz to 4 kHz, and A-weighted signal-to-noise ratio (SNR), as well as other variations of U50 with respect to the frequency averaging, are equivalent for assessing speech intelligibility. ...
... The useful-to-detrimental ratio, U50 (dB), is defined as 10 times the base-10 logarithm of the ratio of the early-arriving speech energy to the sum of the later-arriving speech energy and the ambient noise. It has been calculated according to Bradley et al. (1999)46 from the measured C50, averaged over the four octaves from 500 Hz to 4 kHz, and from the A-weighted SNR. ...
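The definition above can be turned into a short calculation. The following is a minimal sketch, not the exact procedure of Bradley et al. (1999): it assumes that C50 is the early-to-late speech energy ratio in dB and that the SNR is the level of the total (early plus late) speech energy relative to the ambient noise. The function name `u50_from_c50_snr` is hypothetical.

```python
import math


def u50_from_c50_snr(c50_db: float, snr_db: float) -> float:
    """Useful-to-detrimental ratio U50 in dB (illustrative sketch).

    Assumptions (not taken from the source):
    - c50_db is 10*log10(early / late) speech energy,
    - snr_db is 10*log10((early + late) / noise).
    """
    early_to_late = 10.0 ** (c50_db / 10.0)
    speech_to_noise = 10.0 ** (snr_db / 10.0)

    # Normalise the total speech energy to 1 so that only ratios matter.
    early = early_to_late / (1.0 + early_to_late)
    late = 1.0 / (1.0 + early_to_late)
    noise = 1.0 / speech_to_noise

    # Early (useful) energy over late energy plus noise (detrimental).
    return 10.0 * math.log10(early / (late + noise))


if __name__ == "__main__":
    # Equal early and late energy (C50 = 0 dB) with noise equal to the
    # total speech energy (SNR = 0 dB) gives early/(late+noise) = 0.5/1.5.
    print(round(u50_from_c50_snr(0.0, 0.0), 2))  # -4.77
```

Under these assumptions, U50 approaches C50 when the noise is negligible and approaches the SNR when the late-arriving energy is negligible, which is one way to see why a poor SNR can dominate the combined measure.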
... These classrooms should also implicitly guarantee U50_M and U50_ctr equal to or higher than 1 dB and LN_gr equal to or lower than 67 dBA. The C50 and U50 target values for moderate requirements agree with those obtained by Bradley,44,46 who recommended a C50 greater than 2 ± 1 dB at mid frequencies for small and medium-sized classrooms with a RT of 0.8 s and a 1 kHz U50 optimum of 1 dB to ensure very good speech intelligibility. ...
Article
To promote a fast and effective characterization of the sound environment in small and medium-sized classrooms, a basic measurement protocol, based on a minimum number of parameters and positions, is provided. Measurements were taken in 29 occupied classrooms belonging to 13 primary schools in Turin, Italy, that differ in location and typology. The background noise level was acquired during silent and group activities, and the reverberation time, speech clarity, useful-to-detrimental ratio and speech level were acquired along the main axis of each classroom and in one or two offset positions. To reduce the number of measured parameters that can be used to fully characterize classroom acoustics, data were divided into two groups on the basis of a cutoff value of maximum occupied reverberation time in the case of moderate and severe requirements. Given the strong correlation among the quantities, thresholds were identified for the other acoustical parameters, and their accuracy and precision were tested to assess their ability to classify the acoustic quality as compliant or non-compliant. Results suggest that more convenient parameters, like clarity in the central position of the classroom, can be used instead of reverberation time to classify classroom acoustics.
... Both Bradley and Hodgson [40][41][42][43][44] have carried out experimental and theoretical studies to investigate the relationship between background noise, reverberation time and speech intelligibility in classrooms. A general outcome is that acoustical conditions in classrooms should be designed based upon speech intelligibility and that noise is a highly critical factor in speech intelligibility. ...
... Bradley et al. [40] found that ambient noise, rather than reverberation, was the most significant factor in understanding speech, and that the most important parameter for speech intelligibility is the signal-to-noise ratio. As far as speech intelligibility and noise are concerned, Klatte et al. [10,12] found that speech perception was more impaired by train sound and energetic classroom noise (without speech) than by informational background speech, but the latter was more detrimental to highly demanding tasks like those based on listening comprehension. ...
... Guidelines for better learning performance and thresholds for worse performance are derived from the overall results, distinguishing between students <12 or ≥12 y.o. [40] and 0.05 according to Bork [68], and thus the above reference values correspond to different acoustic conditions. In the case of STI, it has been checked that all the reference values could be respected at the design or verification phases, particularly the ones referred to the STI for which the highest values, close to 1, are difficult to be guaranteed [39]. ...
Article
A review on the reference values of the acoustical parameters that have the greatest influence on students' performance at different ages has been completed in this study. Published studies from 2002 to 2020 were summarized, which focus on testing learning attainments and cognitive skills, speech intelligibility, and subjective perceptions under different classroom acoustic conditions. Only 38 papers out of the 56 containing empirical findings on the influence of acoustical parameters on students’ performance were considered, as the remaining 18 lacked ecological validity or did not respond to the selection criteria. Studies were only included if they considered normal hearing subjects and typical lesson settings provided in classrooms without an amplification system. Thus, the studies selected involved subjects between 5 and 40 years old. The values of the acoustical parameters that led to better or worse learning performance were tabulated and the distribution of occurrences was created. The median and interquartile range for the distribution of occurrences were used to describe the central tendency and dispersion and analyzed considering two different age groups. The results obtained were used to develop classroom acoustic guidelines for better learning performance for students under or over 12 years of age, and identified the thresholds that lead to worse performances.
... In the context of speech communication in a classroom, the first component, the transmission path of the signal between speakers and listeners, is influenced by the signal-to-noise ratio (SNR) and reverberation time of the room (Bradley et al., 1999; Houtgast, 1981). The SNR describes the ratio between the acoustic power of a speech signal and background noise. ...
... The SNR describes the ratio between the acoustic power of a speech signal and background noise. At a low SNR, the noise masks linguistic information in the speech signal, reducing the listeners' access to cues necessary for understanding speech (Bradley et al., 1999). Reverberation time affects the listeners' perception of speech prosody through temporal smearing, which inhibits listeners from receiving its cues such as duration and rhythm (Lecumberri et al., 2010). ...
... The effect of background noise and reverberation time on speech intelligibility of speakers with healthy voice has been well-documented in the literature (Bradley et al., 1999; Houtgast, 1981; Lecumberri et al., 2010; Yang and Bradley, 2009; Astolfi et al., 2012; Prodi et al., 2013). Recent studies showed the negative effect of background noise on the speech intelligibility of speakers with dysphonia (Ishikawa et al., 2017; Ishikawa et al., 2020); however, how the reverberation time alone or in combination with background noise affects the intelligibility of these speakers has not been described. ...
Article
Voice disorders can reduce the speech intelligibility of affected speakers. This study evaluated the effect of noise, voice disorders, and room acoustics on vowel intelligibility, listening easiness, and the listener's reaction time. Three adult females with dysphonia and three adult females with normal voice quality recorded a series of nine vowels of American English in /h/-V-/d/ format (e.g., “had”). The recordings were convolved with two oral-binaural impulse responses acquired from measurements in two classrooms with 0.4 and 3.1 s of reverberation time, respectively. The stimuli were presented in a forced-choice format to 29 college students. The intelligibility and the listening easiness were significantly higher in quiet than in noisy conditions, when the speakers had normal voice quality compared to a dysphonic voice, and in low reverberated environments compared to high reverberated environments. The response time of the listener was significantly longer for speech presented in noisy conditions compared to quiet conditions and when the voice was dysphonic compared with healthy voice quality.
Article
Voice disorders reduce speech intelligibility. This study evaluated the effect of noise, voice disorders and room acoustics on vowel intelligibility. Twenty-nine college students listened to 11 vowels in /h/-V-/d/ format. The speech was recorded by three adult females with dysphonia and three adult females with normal voice quality. The recordings were convolved with two oral-binaural impulse responses with 0.4 s and 3.1 s of reverberation time. The intelligibility and the listening easiness were significantly higher in the quiet condition, when the speakers had normal voice quality and in low reverberated environments, while the response time of the listener was longer in the noise condition.
... Studies conducted in Italy [30] and the UK [31] demonstrate that vulnerable populations are also those that live in the most overcrowded conditions (i.e., number of people per square meter living in a dwelling). Living in crowded conditions and increased background noise could therefore affect the acoustic quality of recordings [32]. ...
... In light of this, it becomes consistent to hypothesize that SNR varies with the economic status of the household. Indeed, "the intelligibility of speech in rooms is influenced by both SNR and the room acoustics characteristics of the space" [32]. And since the poorest households are those most likely to experience overcrowding in their dwellings [43], and who live in the least well insulated housing [25], it seems plausible that the ambient noise in their dwellings is higher than that of the richest households. ...
... Regarding the former, it has long been known that early reflections are beneficial for SI because the auditory system can integrate their energy contribution with the direct sound (Lochner and Burger, 1964). The cutoff for the time delay that discriminates between useful and detrimental reflections for speech has been conventionally set at 50 ms, prompting the use of the acoustic parameters clarity C50 and useful-to-detrimental ratio U50 (Bradley et al., 1999;Soulodre et al., 1989) to assess the suitability of a given impulse response to ensure speech perception (ISO 3382-1, 2009). According to Bradley et al. (2003), the improvement in SI due to early reflections corresponds to the gain obtained by an equal increase in the energy content of the direct sound. ...
... SI was also assessed in these noisy conditions. Briefly, while noise and long reverberation are known to have a largely additive effect and to reduce SI (George et al., 2010;Bradley et al., 1999), the present experiments showed that their effect is not additive when single spatial percepts are considered. Noise distorts the spatial percepts but can restore some spatial qualities canceled by reverberation in quiet conditions. ...
Article
Changing the balance between the early and late reflections in the impulse response affects the clarity of speech, and also the spatial perception of the sound source is affected when the direction of the early reflections is manipulated. While the effect of noise on early reflections has long been investigated in speech intelligibility studies, it is unclear whether and how the spatial characteristics of the source are altered by noise, and whether this would influence speech intelligibility in any way. The aim of the present work was to analyze the spatial perception of a speech source in noise and its relationship, if any, with speech intelligibility. Impulse responses with specular or scattered early reflections and two different reverberant tails were used to create sound fields with controlled clarity and reverberation. It emerged that noise affects spatial cues compared to the reverberation-only (quiet) condition; ratings are consequently changed, and most percepts are distorted. Speech intelligibility is also sensitive to changes in acoustic variables and the type of reflection, but the direct association between spatial percepts and speech intelligibility is weak.
... Thiele [20] developed a room acoustical criterion for speech intelligibility referred to as definition (D50); it is defined as the early-to-total energy ratio of h. An extension to D50 that includes the noise floor η is U50 [21], which is defined in a logarithmic scale. ...
... Among the energy ratios in Table I, C50 and C80 are well defined as ELERs; they are also linearly related to the subjective response in terms of useful-to-detrimental ratios [21]. This paper proposes the evaluation of these metrics on directional RIRs. ...
Conference Paper
Early-to-late energy ratios (ELER) are used to quantify speech intelligibility and music clarity in acoustic spaces from measurements of omnidirectional room impulse responses (RIR). Nowadays, the capture of directional RIRs is possible with spherical microphone arrays and the spherical Fourier transform. These tools are thus motivating the enhancement of omnidirectional metrics and the search for new metrics to quantify directional features of sound. This research explores a directional metric of intelligibility and clarity based on ELERs of directional RIRs. The early-to-late transition times are chosen according to the content: 50 ms for speech and 80 ms for music. The proposed metrics can therefore be interpreted as directional versions of the standard clarity indexes of speech (C50) and music (C80). Directional RIRs were captured at many seats in a large auditorium using a first-order ambisonics microphone. Supporting acoustic simulations of a cuboid room with a second-order ambisonics microphone were also used. Directional ELERs were calculated in the octave bands within the operation range of the microphones. Three directional ELER patterns were identified: an omnidirectional pattern, a dipole pointing forward and backward, and a beam pointing towards the source.
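As an illustration of how an early-to-late energy ratio is obtained from an impulse response, the sketch below splits a sampled RIR at a given transition time and compares the energy on either side. It is a simplified, broadband example under stated assumptions: the first sample is taken as the arrival of the direct sound, no octave-band filtering is applied, and the function name `early_to_late_ratio_db` is hypothetical. It is not the directional method proposed in the paper, although the same calculation could in principle be applied to each directional RIR.

```python
import math
from typing import Sequence


def early_to_late_ratio_db(rir: Sequence[float], fs: float, t_ms: float) -> float:
    """Early-to-late energy ratio in dB for a transition time of t_ms.

    Assumption: rir[0] is the arrival of the direct sound, so the
    transition time is counted from the first sample.
    """
    split = int(round(fs * t_ms / 1000.0))
    early = sum(x * x for x in rir[:split])
    late = sum(x * x for x in rir[split:])
    if early <= 0.0 or late <= 0.0:
        raise ValueError("both early and late parts must contain energy")
    return 10.0 * math.log10(early / late)


if __name__ == "__main__":
    # Synthetic RIR sampled at 1 kHz: direct sound at 0 ms and two
    # reflections of amplitude 0.5 at 60 ms and 100 ms.
    fs = 1000.0
    rir = [0.0] * 200
    rir[0], rir[60], rir[100] = 1.0, 0.5, 0.5

    # 50 ms split: early = 1.0, late = 0.5 -> 10*log10(2)
    print(round(early_to_late_ratio_db(rir, fs, 50.0), 2))  # 3.01
    # 80 ms split: early = 1.25, late = 0.25 -> 10*log10(5)
    print(round(early_to_late_ratio_db(rir, fs, 80.0), 2))  # 6.99
```

The reflection at 60 ms counts as late for the 50 ms (speech) split but as early for the 80 ms (music) split, which is why the two ratios differ for the same impulse response.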
... This is expected to result in a better representation of how the listening effort varies in a real-life classroom, where children experience incorrect responses as "an expected symptom of challenging listening situations" (Hsu et al. 2017). This approach (considering both correct and incorrect responses) was adopted in several recent studies where RT was used as an indication of listening effort (Uslar et al. 2013; Meister et al. 2018). For the mental calculation task, RTs were calculated including only correct responses, due to convergence issues with the statistical model including all trials. ...
TABLE 2. Listening conditions during the tasks in terms of reverberation times (Tmid, averaged over the 0.5- to 2-kHz frequency bands), clarity (C50, averaged over the 1- to 4-kHz frequency bands), useful-to-detrimental ratio (U50 [C50 (1-4 kHz)]; Bradley et al. 1999), sound pressure level (LA,eq), signal-to-noise ratio, and Speech Transmission Index.
... For an SNR of −8.5 dB, U50 was −10.5 dB for the short and −10.3 dB for the moderate reverberation times; for an SNR of −4.5 dB, it was −6.1 and −6.5 dB, respectively. In both cases, the gap between U50 in the two reverberation conditions was smaller than the JND of 1 dB (Bradley et al. 1999) due to the unfavorable SNR and the close proximity of the source and receiver (shorter than the critical radius rC from the mid-frequencies onwards). Therefore, it is hardly surprising that a fairly large difference in reverberation time (0.74 s) did not significantly affect accuracy or listening effort in the study by Picou et al. (2019). ...
Article
Objectives: The purpose of this study was to investigate the effect of a small change in reverberation time (from 0.57 to 0.69 s) in a classroom on children's performance and listening effort. Aiming for ecological listening conditions, the change in reverberation time was combined with the presence or absence of classroom noise. In three academic tasks, the study examined whether the effect of reverberation was modulated by the presence of noise and depended on the children's age. Design: A total of 302 children (aged 11-13 years, grades 6-8) with normal hearing participated in the study. Three typical tasks of daily classroom activities (speech perception, sentence comprehension, and mental calculation) were administered to groups of children in two listening conditions (quiet and classroom noise). The experiment was conducted inside real classrooms, where reverberation time was controlled. The outcomes considered were task accuracy and response times (RTs), the latter taken as a behavioral proxy for listening effort. Participants were also assessed on reading comprehension and math fluency. To investigate the impact of noise and/or reverberation, these two scores were entered in the statistical model to control for individual child's general academic abilities. Results: While the longer reverberation time did not significantly affect accuracy or RTs under the quiet condition, it had several effects when in combination with classroom noise, depending on the task measured. A significant drop in accuracy with a longer reverberation time emerged for the speech perception task, but only for the grade 6 children. The effect on accuracy of a longer reverberation time was nonsignificant for sentence comprehension (always at ceiling), and depended on the children's age in the mental calculation task. 
RTs were longer for moderate than for short reverberation times in the speech perception and sentence comprehension tasks, while there was no significant effect of the different reverberation times on RTs in the mental calculation task. Conclusions: The results indicate small, but statistically significant, effects of a small change in reverberation time on listening effort as well as accuracy for children aged 11 to 13 performing typical tasks of daily classroom activities. Thus, the results extend previous findings in adults to children as well. The findings also contribute to a better understanding of the practical implications and importance of optimal ranges of reverberation time in classrooms. A comparison with previous studies underscored the importance of early reflections as well as reverberation times in classrooms.
... Furthermore, the effects of masks on SI should be studied in relation to other factors influencing speech transmission, such as the communication path between speakers and listeners, e.g., the SNR and reverberation time [9][10][11]. Reverberation degrades speech prosody as temporal smearing inhibits the correct identification of cues, such as duration and rhythm, that convey prosodic information [12]. High noise levels can degrade the speech signal by decreasing the perceived sound level, thereby reducing SI [9][10][11]. Due to the spread of the Covid-19 virus, it is likely that teachers and professors in the majority of schools all over the world will wear face masks. ...
... This challenge for speech communication will be added to already existing negative factors, such as poor acoustics and high noise levels, often experienced in classrooms. ...
Article
This study explored the effects of wearing face masks on classroom communication. We evaluated the effects of three different types of face masks (fabric, surgical and N95 masks) on speech intelligibility presented to college students in auralized classrooms. To simulate realistic classroom conditions, speech stimuli were presented in the presence of speech-shaped noise with a signal-to-noise ratio of +3 dB under two different reverberation times (0.4 s and 3.1 s). The use of fabric masks yielded a significantly greater reduction in speech intelligibility compared to the other masks. Therefore, surgical masks or N95 masks are strongly recommended in teaching environments.
Article
This study explored the effects of wearing face masks on classroom communication. The effects of three different types of face masks (fabric, surgical, and N95 masks) on speech intelligibility (SI) presented to college students in auralized classrooms were evaluated. To simulate realistic classroom conditions, speech stimuli were presented in the presence of speech-shaped noise with a signal-to-noise ratio of +3 dB under two different reverberation times (0.4 s and 3.1 s). The use of fabric masks yielded a significantly greater reduction in SI compared to the other masks. Therefore, surgical masks or N95 masks are recommended in teaching environments.
... ERs have been found to enhance the intelligibility of speech to some extent (Bradley, Reich, and Norcross 1999;Bradley, Sato, and Picard 2003;Arweiler, Buchholz, and Dau 2009). Depending on their delay time, direction and intensity, ERs may not be perceived per se, but their energy can be integrated with the DS due to the precedence effect (Litovsky et al. 1999). ...
... At a close distance, DS dominates, and late reverberation does not affect speech intelligibility as much as that at greater distances, where intelligibility is dependent mainly on the amount of late reverberant sound energy (Peutz 1971). Bradley, Reich, and Norcross (1999) observed the detrimental effect of increased reverberation on speech intelligibility only with smaller C50 values, that is, in conditions with more reverberant sound energy. In the conditions with relatively less reverberation and C50 values in the same range as in this study, there was no effect of reverberation on speech intelligibility. ...
Article
Objective: Speech-in-noise tests are widely used in hearing diagnostics but typically without reverberation, although reverberation is an inextricable part of everyday listening conditions. To support the development of more real-life-like test paradigms, the objective of this study was to explore how spatially reproduced reverberation affects speech recognition thresholds in normal-hearing and hearing-impaired listeners. Design: Thresholds were measured with a Finnish speech-in-noise test without reverberation and with two test conditions with reverberation times of ∼0.9 and 1.8 s. Reverberant conditions were produced with a multichannel auralisation technique not used before in this context. Study sample: Thirty-four normal-hearing and 14 hearing-impaired listeners participated in this study. Five people were tested with and without hearing aids. Results: No significant differences between test conditions were found for the normal-hearing listeners. Results for the hearing-impaired listeners indicated better performance for the 0.9 s reverberation time compared to the reference and the 1.8 s conditions. Benefit from hearing aid use varied between individuals; for one person, an advantage was observed only with reverberation. Conclusions: Auralisations may offer information on speech recognition performance that is not obtained with a test without reverberation. However, more complex stimuli and/or higher signal-to-noise ratios should be used in the future.
... Bradley et al. [26] investigated speech intelligibility in classrooms, examining the relation between signal-to-noise ratio and room acoustic parameters. The results from [26] show that the effect of signal-to-noise ratio is very important for speech intelligibility, and useful-to-detrimental ratios are proposed and recommended, instead of only focusing on the reverberation time. Further, they concluded that an increase in early reflections could improve signal-to-noise ratio by up to 9 dB [27]. ...
Article
Full-text available
Several room acoustic parameters have to be considered in ordinary public rooms, such as offices and classrooms, in order to represent the actual conditions, thus increasing demands on the acoustic treatment. The most common acoustical treatment in ordinary rooms is a suspended absorbent ceiling. Due to the non-uniform distribution of the absorbent material, the classical diffuse field assumption is not fulfilled in such cases. Further, the sound scattering effect of non-absorbing objects such as furniture is considerable in these types of rooms. Even the directional characteristics of the sound scattering objects are of importance. The sound decay curve in rooms with absorbent ceilings often demonstrates a double slope. Thus, it is not possible to use reverberation time as a representative standalone acoustic measure. An evaluation that captures the true room acoustical conditions therefore needs supplementary parameters. The aim of this experimental study is to show how various acoustical treatments affect reverberation time T20, speech clarity C50 and sound strength G. The experiment was performed in a mock-up of a classroom. The results demonstrated how absorbers, diffusers and scattering objects influence room acoustical parameters. It is shown that to some extent the parameters can be adjusted individually by using different treatments or combinations of treatments. This allows for the fine-tuning of the acoustical conditions, in order to fulfill the requirements for achieving a high-quality sound environment.
... However, in practice, the effectiveness of synthetic voice for use in stations has not been verified. For example, Tachibana [24] emphasized the importance of speech intelligibility in noisy environments, and there have been many studies on the optimal volume and speech rate of broadcasts in public spaces such as airports and train stations [1,2,24] and under high noise levels [25][26][27][28][29][30][31][32][33]. However, it is difficult to immediately apply these findings to the environments in train stations because the acoustic characteristics of the sound environment can differ depending on the location. ...
Article
Full-text available
An experimental study on the effect of the speech characteristics of the signal-to-noise ratio (SNR) and speech rate on the intelligibility of announcements at railway stations was conducted using an artificial synthetic voice. Synthesized speech has recently been used in noisy environments both indoors and outdoors, but unlike its use in quiet environments, when the environment is noisy, the intelligibility of announcements may be reduced. For railway station announcements, while natural spoken voices are currently used for multilingual announcements and disaster response broadcasts, deep neural network synthesized voices, which use deep learning, have also been adopted. However, the effect of the acoustic characteristics such as the SNR and speech rate on the intelligibility of reproduced announcements in noisy public spaces such as railway stations has not yet been clarified from a practical viewpoint. In this paper, in order to determine the appropriate SNR and speech rate for synthetic voice announcements in railway stations, auditory impressions of announcements with varying SNR and speech rate were evaluated by participants using a five-point scale. Based on the evaluations, the appropriate conditions for the broadcast of synthetic voice announcements at the ticket gate and on the platform of a station are discussed.
... Other studies [16][17][18][19][20] have shown that speech intelligibility is influenced by the reverberation time (RT) and signal-to-noise ratios (SNR). ...
Article
Full-text available
Heating, ventilation, and air conditioning (HVAC) systems represent one of the main noise sources inside classrooms. This explains why HVAC systems require careful design, competent installation and balancing, and regular maintenance. Many factors influence the classroom acoustical design, such as air handlers or fans, the velocity of air inside the classroom, as well as the size and acoustical treatment of ducts, returns, and diffusers. Acoustic parameters, including background-noise levels, reverberation time, and intelligibility, were analyzed in 17 classrooms at the Università Politecnica in the Marche region. The study of intelligibility was performed by measuring the objective parameters in situ and using prediction methods to determine the intelligibility score. The relationship between speech intelligibility measurements and speech intelligibility calculation has been studied. The relationship between the STI values and the background-noise levels and the reverberation time was also studied. This research compares predictive methods and measurement methods for assessing speech intelligibility in classrooms of different sizes with and without HVAC systems. The current method of calculating the speech transmission index (STI), proposed by national and international standards, has been used to determine speech intelligibility scores in classrooms. The results show that the calculation tool has computational robustness allowing its use in preliminary evaluations of speech intelligibility, design of the optimal type of school buildings, and sound amplification systems in classrooms that comply with Italian regulations.
... More recent studies showed that SNR is a good predictor of both the audibility and intelligibility of speech sounds. For example, Bradley et al. [10] discussed the combined effects of SNR and room acoustics on speech intelligibility, revealing that the effect of SNR is more important than those of room acoustics factors. To achieve good speech intelligibility, the SNR should exceed 15 dB and have an optimum reverberation time [11]. ...
Article
Full-text available
Corridors have a crucial effect on classroom acoustics, but their study is generally ignored. After examining the acoustic performance of a coupled space comprising a corridor and two classrooms, a new method for calculating the signal-to-noise ratio (SNR) by the analysis of the energy distribution is proposed in this paper. Numerical experiments are conducted using the finite element software COMSOL Multiphysics. Various parameters are explored using the proposed numerical model to determine their effect on the SNR at different receiving points in a classroom. It was observed that factors such as the corridor width, absorption of the corridor carpet materials, and aperture area (i.e., classroom doors) affect room acoustics. The results show that an absorption treatment of the corridor has a significant effect on classroom acoustic performance. The findings of this study help us understand various metrics of acoustic performance in extended coupled spaces, such as speech intelligibility in classrooms and offices.
... Parameters of greater relevance are those that consider the contribution of early reflections related to the listening experience [12], such as the Clarity index C50 (ISO 3382-1:2009 [13]), or parameters that include both room acoustics and speech-to-noise-ratios measurements, such as the Speech Transmission Index STI (IEC 60268-16:2020 [14]) or the Useful-to-detrimental sound ratio U50 [15]. Describing these parameters in international [13,16] and national acoustic standards [17,18], it has become necessary to unambiguously establish which of these is needed as the main parameter of the acoustic quality of learning environments, even considering the expensive instrumentation required for proper in situ measurement. ...
Article
Full-text available
The speech intelligibility properties of classrooms greatly influence the learning process of students. Proper acoustics can promote the inclusion of foreign students and children with learning or hearing impairments. While awareness of the topic is increasing, there is still no parameter that can describe all aspects of speech transmission inside a room. This complicates the design of classrooms and requires designers to have extensive knowledge of theory and experience. In the scientific and technical literature, there is a lack of predictive tools, easy to use by designers, which can guide the choices in the early design stages in order to move towards technical solutions able to ensure adequate levels of speech intelligibility. For this reason, in this paper, the most relevant speech intelligibility parameters found in the literature were collected and discussed. Among these, the Clarity index and Speech Transmission Index were singled out as the most effective ones, whose prediction can be made with relatively simple methods. They were then analyzed through their prediction formulas, and a tool was proposed to allow an easy estimation of the minimum total equivalent sound absorption area needed in a classroom. This tool greatly simplifies the early acoustics design stage, allowing the intelligibility of speech within a classroom to be increased without requiring much theoretical effort on the part of the designers.
... It combines the binaural model jelfs2011 predicting SRM of a near-field target from multiple stationary noise interferers and a U/D decomposition taking into account the temporal smearing effect of reverberation on speech transmission. The U/D decomposition regards the early reflections of the target as useful and part of the signal because they reinforce the direct sound [43], whereas the late reflections are regarded as detrimental and effectively a part of the noise [44][45][46]. The revised model leclere2015 is identical to jelfs2011, except that it incorporates a front end realizing the U/D decomposition. ...
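The useful-to-detrimental idea described above (early speech energy counted as signal, late reverberant energy counted together with the noise) can be illustrated with a short numerical sketch. This is not code from any of the cited papers or from the Auditory Modeling Toolbox. It assumes one common broadband formulation, in which the early fraction of the speech energy is given by Definition D50 and the SNR is the ratio of total speech energy to noise energy; the function names are hypothetical.

```python
import math


def c50_from_d50(d50: float) -> float:
    """Clarity C50 in dB from Definition D50 (early / total speech energy)."""
    if not 0.0 < d50 < 1.0:
        raise ValueError("d50 must lie strictly between 0 and 1")
    # Early-to-late energy ratio, expressed in decibels.
    return 10.0 * math.log10(d50 / (1.0 - d50))


def u50_from_d50_and_snr(d50: float, snr_db: float) -> float:
    """Useful-to-detrimental ratio U50 in dB (assumed formulation).

    useful      = early speech energy            -> d50
    detrimental = late speech energy + noise     -> (1 - d50) + 10**(-snr_db / 10)
    All energies are normalised by the total speech energy.
    """
    if not 0.0 < d50 < 1.0:
        raise ValueError("d50 must lie strictly between 0 and 1")
    noise_fraction = 10.0 ** (-snr_db / 10.0)
    return 10.0 * math.log10(d50 / ((1.0 - d50) + noise_fraction))


if __name__ == "__main__":
    # With D50 = 0.5 the early and late energies are equal, so C50 = 0 dB;
    # adding noise at 0 dB SNR pushes the useful-to-detrimental ratio lower.
    print(round(c50_from_d50(0.5), 3))
    print(round(u50_from_d50_and_snr(0.5, 0.0), 3))
```

Under this formulation U50 approaches C50 as the SNR grows and is always lower than C50 when noise is present, which matches the description of late reflections acting as an additional noise component.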
Article
Full-text available
This technical paper presents a series of speech intelligibility models that have been developed since the original version proposed by Lavandier and Culling [(2010). Journal of the Acoustical Society of America 127, 387–399]. This binaural model accounts for better-ear listening and binaural unmasking to predict the intelligibility of a near-field target speech among multiple stationary noise sources in rooms for normal-hearing listeners. Subsequent model versions made it possible to consider a reverberated speech target in the far-field, envelope-modulated noise sources, and hearing-impaired listeners. As an intermediate step before considering speech maskers, a monaural version incorporating a harmonic-cancellation mechanism was recently developed to account for the effect of a stationary harmonic masker. This technical review is oriented towards model users and explains when and how each model should be used, points out its advantages and limitations, and provides an example of predictions using a data set from the literature. All these models along with the data, signals and code used to prepare the presented figures are made available within the Auditory Modeling Toolbox (AMT 1.1).
... It has been shown that in a room the envelope of the sound can be temporally smeared, depending on the level of reverberation [3], due to the individual sound paths with various time delays taken by the sound reflections. The temporal smearing of the target speech has been shown to reduce its intelligibility [4,5]. When speech is heard in the presence of an interfering noise, speech intelligibility increases if the noise is modulated in amplitude [6]. ...
Article
Full-text available
Reverberation can have a strong detrimental effect on speech intelligibility in noise. Two main monaural effects were studied here: the temporal smearing of the target speech, which makes the speech less understandable, and the temporal smearing of the noise, which reduces the opportunity for listening in the masker dips. These phenomena have been shown to affect normal-hearing (NH) listeners. The aim of this study was to understand whether hearing-impaired (HI) listeners are more affected by reverberation, and if so to identify which of these two effects is responsible. They were investigated separately and in combination, by applying reverberation either on the target speech, on the noise masker, or on both sources. Binaural effects were not investigated here. Intelligibility scores in the presence of stationary and modulated noise were systematically compared for both NH and HI listeners in these situations. At the optimal signal-to-noise ratios (SNRs) (that is to say, the SNRs with the least amount of floor and ceiling effects), the temporal smearing of both the speech and the noise had a similar effect for the HI and NH listeners, so that reverberation was not more detrimental for the HI listeners. There was only a very limited dip listening benefit at this SNR for either group. Some differences across groups appeared at the SNR maximizing dip listening, but they could not be directly related to an effect of reverberation, and were rather due to floor effects or to the reduced ability of the HI listeners to benefit from dip listening, even in the absence of reverberation.
... Health is a fundamental component of human quality of life, reflected in a person's capacity to work and to learn. A noisy environment gives rise to fatigue, loss of concentration, nervousness, stress reactions, anxiety, memory lapses, low productivity, tiredness, irritation, problems with human relationships, ... Several authors have examined this question (BRADLEY, 1986, 1996, 1999) ... (1947). An articulation test, put very briefly, consists of presenting a dictation, which may be made up of unrelated monosyllabic words, to be written down by those present. ...
... The mathematical relation for the signal-to-noise ratio is given by $\mathrm{SNR} = 20 \log_{10}(S/N)$, where S represents the desired signal level and N represents the noise signal level [83], [84]. ...
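As a small worked example of the relation quoted above, the following sketch evaluates the SNR in decibels. It assumes that S and N are linear amplitude quantities (for instance RMS sound pressures), which is the case in which the factor 20 applies; for power or energy quantities the factor would be 10. The function name and the sample values are hypothetical.

```python
import math


def snr_db_from_amplitudes(signal_rms: float, noise_rms: float) -> float:
    """SNR in dB from linear RMS amplitudes: 20 * log10(S / N)."""
    if signal_rms <= 0.0 or noise_rms <= 0.0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(signal_rms / noise_rms)


if __name__ == "__main__":
    # A signal with ten times the RMS amplitude of the noise.
    print(snr_db_from_amplitudes(10.0, 1.0))
    # Equal signal and noise amplitudes.
    print(snr_db_from_amplitudes(1.0, 1.0))
```

A tenfold amplitude ratio corresponds to 20 dB, and equal amplitudes correspond to 0 dB.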
Article
Full-text available
Gunshot event localization and classification have numerous real-time applications. The study is also useful for steering video cameras and guns in the indicated direction. This paper proposes a framework that can be used for a surveillance system to accurately localize and classify the type of gunshots contaminated with wind noise. The main contribution of this paper is the localization of the gunshot for the very first time using the Hadamard product with wavelet de-noising in windy conditions. We have evaluated our framework on an airborne gunshot acoustic dataset, and a derived (simulated) sound dataset, as an offline scenario, using four microphones’ geometry. For localization, the proposed system achieved an accuracy of 99.95%. The other contribution is a sensitivity-based comprehensive examination of gunshot sound signals, with normal to strong wind noise of varying SNRs, for machine learning and deep learning classifiers to categorize the type of gunshots. For classification, it has been found, which was not previously known for the gunshot dataset, that ELM is robust for original, normal, and strong windy environments with an accuracy of 93.01%, 91.61%, and 88.11% respectively with the threshold SNR. A comprehensive comparison of recent techniques with the proposed approach has also been added.
... Given that reduced speech intensity is the most consistently reported impact of face masks and that speaking loudly with a mask improves intelligibility, reducing environmental noise could greatly benefit communication (Bradley et al., 2002). To that end, we recommend lowering or turning off music or television in places where people need to communicate, such as in stores and restaurants. ...
Article
Full-text available
Mask-wearing during the COVID-19 pandemic has prompted a growing interest in the functional impact of masks on speech and communication. Prior work has shown that masks dampen sound, impede visual communication cues, and reduce intelligibility. However, more work is needed to understand how speakers change their speech while wearing a mask and to identify strategies to overcome the impact of wearing a mask. Data were collected from 19 healthy adults during a single in-person session. We investigated the effects of wearing a KN95 mask on speech intelligibility, as judged by two speech-language pathologists, examined speech kinematics and acoustics associated with mask-wearing, and explored KN95 acoustic filtering. We then considered the efficacy of three speaking strategies to improve speech intelligibility: Loud, Clear, and Slow speech. To inform speaker strategy recommendations, we related findings to self-reported speaker effort. Results indicated that healthy speakers could compensate for the presence of a mask and achieve normal speech intelligibility. Additionally, we showed that speaking loudly or clearly—and, to a lesser extent, slowly—improved speech intelligibility. However, using these strategies may require increased physical and cognitive effort and should be used only when necessary. These results can inform recommendations for speakers wearing masks, particularly those with communication disorders (e.g., dysarthria) who may struggle to adapt to a mask but can respond to explicit instructions. Such recommendations may further help non-native speakers and those communicating in a noisy environment or with listeners with hearing loss.
... To use teleconference meetings as a feasible alternative to in-person meetings, speech intelligibility is a crucial factor. The reduced speech intelligibility due to echoes introduced by the room makes it crucial to be aware of the nearby walls that introduce these echoes [2]. The introduction of smart loudspeakers gives rise to opportunities to estimate room parameters to improve the sound experience of the user. ...
Preprint
Full-text available
Knowledge of the room acoustic properties, e.g., the location of acoustic reflectors, makes it possible to better reproduce the sound field as intended. Current state-of-the-art methods for room boundary detection using microphone measurements typically focus on a two-dimensional setting, causing a model mismatch when employed in real-life scenarios. Detection of arbitrary reflectors in three dimensions encounters practical limitations, e.g., the need for a spherical array and the increased computational complexity. Moreover, loudspeakers may not have an omnidirectional directivity pattern, as usually assumed in the literature, making the detection of acoustic reflectors in some directions more challenging. In the proposed method, a LiDAR sensor is added to a loudspeaker to improve wall detection accuracy and robustness. This is done in two ways. First, the model mismatch introduced by horizontal reflectors can be resolved by detecting reflectors with the LiDAR sensor to enable elimination of their detrimental influence from the 2D problem in pre-processing. Second, a LiDAR-based method is proposed to compensate for the challenging directions where the directive loudspeaker emits little energy. We show via simulations that this multi-modal approach, i.e., combining microphone and LiDAR sensors, improves the robustness and accuracy of wall detection.
... be useful for SI as they can be integrated with the direct sound [9][10][11][12][13]. In addition, there is a loss in the transmission of speech from the front speaker to the rear listeners because the cabin is separated into two compartments by the seat-back [3], which are found as the major contributor to the interior sound absorption inside an automobile [14]. ...
Article
The head orientation of the listener significantly affects speech intelligibility (SI) in automobiles due to the effect of binaural listening and special acoustic conditions such as early reflections and seat-back occlusions. However, this issue has not yet been studied with subjective tests. This study investigates SI with various head orientations of a listener in automobiles with a subjective experiment. The sentence speech reception thresholds (SRT) in Mandarin Chinese are measured via headphones virtually in an automobile environment, and compared with results in a weak-reflective listening room. A virtual speaker is located in the front-passenger seat, the right-back seat and the left-back seat in sequence, by convolving the target speech with corresponding binaural room impulse responses measured on a dummy head in the driver’s seat with five head orientations. Results show that the SRT variations caused by head orientations are up to 5 dB in the automobile, lower than the 9 dB found in the listening room. Under various head orientations, the SRT in the automobile decreases as the virtual speaker moves closer to the front lateral direction, which is similar to the trend in the listening room, which is mainly determined by the effect of binaural listening. Overall, a lower SRT in the automobile can be obtained when the listener in the driver seat turns their head inward, i.e., towards the right. In comparison with the result in the listening room, early reflections improve SI in the automobile by an SRT decrease of up to 3.5 dB, while seat-back occlusions reduce SI by an SRT increase of up to 5.3 dB. The early reflections play a more significant role when the listener is in an adverse position, e.g., for the head orientations and speaker locations that make it difficult for direct sounds to reach the listener’s ears.
Comparison between the SRTs based on speech transmission index (STI) and STI-SI models and the SRTs measured by subjective experiment indicates that the STI-based objective method only partially expresses the variation in SI obtained from subjective experiment, and the difference between the subjective results and the objective results is mainly caused by the inapplicability of the STI-SI models derived from the traditional room in the automotive environment. The present work is relevant to understanding the combined effect of various factors on SI under such a special acoustic condition in automobile.
... Under the natural sound condition, assuming that SNR ≥ 15 dBA represents the ideal background noise conditions during typical classroom use (Bradley et al. 1999b), when the SNR is equal to 15 dBA and the STI is equal to the recommended value of 0.62 (IEC 2020), according to Eq. (1), the RT 0.5-1kHz is approximately equal to 0.66 s, which is less than some suggested standards. According to Eq. (2), which has been derived from the author's previous research (Zhu et al. 2014), when the STI is 0.62, the speech intelligibility (SI in the equation) is approximately 92%. ...
Article
The acoustic environment of the classroom is one of the most important factors influencing the teaching and learning effects of the teacher and students. It is critical to ensure good speech intelligibility in classrooms. However, due to some factors, it may not be easy to achieve an ideal classroom acoustic environment, especially in large-scale multimedia classrooms. In a real renovation project of 39 multimedia classrooms in a university, seven typical rooms were selected, and the acoustic environment optimisation design and verification for these multimedia classrooms were performed based on simulation. First, the acoustic and sound reinforcement design schemes were determined based on the room acoustics software ODEON. Next, the effects of the optimisation design were analysed, and the simulated and measured results were compared; the accuracy of using the reduced sound absorption coefficients, which were determined empirically, was also examined. Finally, the recommended reverberation times (RTs) in multimedia classrooms corresponding to speech intelligibility were discussed, the effectiveness of the speech transmission index (STI) as a primary parameter for classroom acoustic environment control was considered, and the acoustic environment under the unoccupied and occupied statuses was compared. The results revealed that although there are many factors influencing the effect of classroom acoustic environment control, an adequate result can be expected on applying the appropriate method. Considering both the acoustic design and visual requirements also makes the classroom likely to have a good visual effect in addition to having a good listening environment.
... Bradley et al. [9] recommend focusing on increasing the ratio of early reflections rather than on lowering the reverberation time in rooms used for speech. It has also been shown in another study by Bradley and Reich that C 50 can, to some extent, complement a low signal-to-noise (S/N) ratio [10]. Yang and Bradley investigated speech intelligibility for different room acoustic conditions, finding high scores in intelligibility along with an increase in early reflections; S/N also affected speech intelligibility, with a lower effect being seen for varied reverberation time [11]. ...
Article
Full-text available
In environments such as classrooms and offices, complex tasks are performed. A satisfactory acoustic environment is critical for the performance of such tasks. To ensure a good acoustic environment, the right acoustic treatment must be used. The relation between different room acoustic treatments and how they affect speech perception in these types of rooms is not yet fully understood. In this study, speech perception was evaluated for three different configurations using absorbers and diffusers. Twenty-nine participants reported on their subjective experience of speech in respect of different configurations in different positions in a room. They judged sound quality and attributes related to speech perception. In addition, the jury members ranked the different acoustic environments. The subjective experience was related to the different room acoustic treatments and the room acoustic parameters of speech clarity, reverberation time and sound strength. It was found that people, on average, rated treatments with a high degree of absorption as best. This configuration had the highest speech clarity value and lowest values for reverberation time and sound strength. The perceived sound quality could be correlated to speech clarity, while attributes related to speech perception had the strongest association with reverberation time.
... The influence of room acoustics on the stimulus modulation has been widely investigated for speech intelligibility. It has been demonstrated that the reverberation and the background noise attenuate the natural fluctuations of the speech signal which are necessary for speech comprehension, which leads to poorer speech intelligibility for longer reverberation times and high noise levels (Bradley et al., 1999). The reverberation time (T) is defined as the time it takes for a sound to decrease by 60 dB in a room after an abrupt termination of the sound source (ISO 3382-1, 2009). ...
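The 60 dB definition of reverberation time quoted above can be made concrete with a short sketch. In practice a full 60 dB decay is rarely available above the noise floor, so the decay slope is commonly fitted over a limited range and extrapolated to 60 dB; the T20 measure mentioned elsewhere on this page uses the range from -5 dB to -25 dB. The code below is an illustration under those assumptions, using a plain least-squares line fit on a decay curve given in dB. The function name and the synthetic decay are hypothetical and are not taken from any of the cited studies.

```python
def reverberation_time_from_decay(times_s, levels_db, start_db=-5.0, end_db=-25.0):
    """Estimate reverberation time by extrapolating a fitted decay slope to 60 dB.

    times_s   : sample times in seconds
    levels_db : decay-curve levels in dB at those times
    start_db, end_db : evaluation range relative to the maximum level
                       (-5 dB to -25 dB corresponds to a T20-style estimate)
    """
    peak = max(levels_db)
    # Keep only the points whose level, relative to the peak, lies in the range.
    points = [(t, level - peak) for t, level in zip(times_s, levels_db)
              if end_db <= level - peak <= start_db]
    if len(points) < 2:
        raise ValueError("not enough points inside the evaluation range")

    # Ordinary least-squares slope of level (dB) against time (s).
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(level for _, level in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (level - mean_l) for t, level in points)
    slope = sxy / sxx  # dB per second, negative for a decaying sound
    if slope >= 0.0:
        raise ValueError("levels do not decay over the evaluation range")

    # Time needed for a 60 dB drop at the fitted decay rate.
    return -60.0 / slope


if __name__ == "__main__":
    # Synthetic, perfectly linear decay constructed to drop 60 dB in 0.8 s.
    times = [i * 0.01 for i in range(100)]
    levels = [-60.0 / 0.8 * t for t in times]
    print(round(reverberation_time_from_decay(times, levels), 3))
```

For a double-slope decay, such as those described for rooms with absorbent ceilings, the value returned depends on the chosen evaluation range, which is one reason a single reverberation time may not characterise such rooms well.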
Article
Full-text available
The sound-field auditory steady-state response (ASSR) is a promising measure for the objective validation of hearing-aid fitting in patients who are unable to respond to behavioral testing reliably. To record the sound-field ASSR, the stimulus is reproduced through a loudspeaker placed in front of the patient. However, the reverberation and background noise of the measurement room could reduce the stimulus modulation used for eliciting the ASSR. As the ASSR level is heavily dependent on the stimulus modulation, any reduction due to room acoustics could affect the clinical viability of sound-field ASSR testing. This study investigated the effect of room acoustics on the level and detection rate of sound-field ASSR. The study also analyzed whether early decay time and an auditory-inspired relative modulation power model could be used to predict the changes in the recorded ASSR in rooms. A monaural auralization approach was used to measure sound-field ASSR via insert earphones. ASSR was measured for 15 normal-hearing adult subjects using narrow-band CE-Chirps® centered at the octave bands of 500, 1000, 2000, and 4000 Hz. These stimuli were convolved with simulated impulse responses of three rooms inspired by audiological testing rooms. The results showed a significant reduction of the ASSR level for the room conditions compared with the reference anechoic condition. Despite this reduction, the detection rates for the first harmonics of the ASSR were unaffected when sufficiently long recordings (up to 6 min) were made. Furthermore, the early decay time and relative modulation power appear to be useful predictors of the ASSR level in the measurement rooms.
... Symbols marked with a * are related to statistically significant (p-value < 0.008) differences in the SRM according to the t-test analysis; symbols marked with parentheses reveal statistically significant different SRM between informational and energetic masker conditions. classrooms [66,67], even though further research is needed that involves binaural metrics. On this matter, Rennies et al. [68] found a high and significant correlation between measured and predicted SRT, the latter being obtained from a binaural speech intelligibility model based on Definition, that is exactly related to C50 as shown in [34]. ...
Article
Enhanced speech intelligibility in realistic acoustic scenarios guarantees effective communication. Noise and reverberation degrade speech intelligibility, but the combined effect of reverberation, informational noise and position of target, listener and noise source on speech intelligibility still needs further insight. This work investigates the effect of real complex acoustic scenarios on speech intelligibility for adults. In two primary-school classrooms with reverberation times of 0.4 s and 3.1 s, receivers were located on axis with the target at increasing distances, and noise sources were both co-located and spatially separated from the target at different distances providing various degrees of energetic and informational masking. Noise level was maintained constant and set at 60 dB at the receiver's position, regardless of the noise source distance. The longer reverberation time resulted in worse speech recognition thresholds (SRT80s) by 6 dB on average. The competitive effect of informational noise resulted in increased SRT80s by 7 dB on average, in comparison to energetic noise. The SRT80 increases by ∼2 dB with doubling the target-to-receiver distance when the noise source is close to the receiver, accounting for both reverberations and noise types. The spatial release from masking resulted in improved speech intelligibility by up to ∼3 dB when the noise source is 1 m from the receiver for the energetic masker in low reverberation and, unexpectedly, for the informational masker in high reverberation. This may indicate that a perceptual segregation mechanism sorts out competing voices of the informational masker according to their directions in the least favorable listening situations.
... Examples of poor architectural solutions that reduce speech intelligibility due to increased noise and reverb time, echo, or flutter echo are shown in paper [1]. Experimental studies have shown that noise in classrooms is more noticeable than late reflections of sound [2]. Acceptable speech intelligibility in lecture rooms is achieved due to early reflections of sound, even if the speaker's head is directed in the opposite direction from a listener [3]. ...
Article
Full-text available
The scores of speech intelligibility, obtained using objective and subjective methods for three university lecture rooms of the small, medium, and large sizes with different degrees of filling, were presented. The problem of achieving high speech intelligibility is relevant for both students and university administration, and for architects designing or reconstructing lecture rooms. Speech intelligibility was assessed using binaural room impulse responses which applied an artificial head and non-professional quality audio equipment for measuring. The Speech Transmission Index was an objective measure of speech intelligibility, while the subjective evaluation of speech intelligibility was carried out using the articulation method. Comparative analysis of the effectiveness of parameters of impulse response as a measure of speech intelligibility showed that Early Decay Time exceeded the score of the T30 reverberation time but was ineffective in a small lecture room. The C50 clarity index for all the considered lecture rooms was the most informative. Several patterns determined by the influence of early sound reflections on speech intelligibility were detected. Specifically, it was shown that an increase in the ratio of the energy of early reflections to the energy of direct sound leads to a decrease in speech intelligibility. The exceptions are small, up to 30–40 cm, distances from the back wall of the room, where speech intelligibility is usually slightly higher than in the middle of the room. At a distance of 0.7–1.7 m from the side walls of the room, speech intelligibility is usually worse for the ear, which is closer to the wall. The usefulness of the obtained results lies in refining the quantitative characteristics of the influence of early reflections of sound on speech intelligibility at different points of lecture rooms.
... In a study by Bradley et al. [11], the recommendation for rooms for speech was to focus on increasing early reflections rather than on lowering reverberation times. Furthermore, Bradley and Reich found that C50 can to some extent compensate for a low S/N [12]. ...
Article
Full-text available
In ordinary public rooms absorbent ceilings are normally used. However, reflective material such as diffusers can also be useful to improve the acoustic performance for this type of environment. In this study, different combinations of absorbers and diffusers have been used. The study investigates whether a test group of 29 people perceived sound in an ordinary room differently depending on the type of treatment. Comparisons of the same position in a room for different configurations as well as different positions within one configuration were made. The subjective judgements were compared to the room acoustic measures T20, C50 and G and the difference in the values of these parameters. It was found that when evaluating the different positions in a room, the configuration including diffusers was perceived to a greater extent as being similar in the different positions in the room when compared to the configuration with absorbers on the walls. It was also seen that C50 was the parameter that mainly affected the perception, with the difference needing to be 2 dB to recognize a difference. However, the room acoustic measurements could not fully explain the differences obtained in perception. In addition, the subjective sound image created by different types of treatments was also shown to have an important impact on the perception.
... Classrooms are learning environments, and as such, the built environment of a classroom should allow students to not only hear speech intelligibly but also to comprehend the meaning behind it and learn the material presented. Researchers have investigated the relationship between the acoustical characteristics of classrooms and the ability of English-speaking students with normal-hearing to understand words or phrases in those rooms (1,2,3,4). These studies and earlier work as reviewed in booklets on classroom acoustics produced by the Acoustical Society of America (5,6) have clearly shown that higher background noise levels (BNL) and longer reverberation times (RT) result in poorer speech intelligibility. ...
Conference Paper
Full-text available
The movement for improved classroom acoustics has primarily been grounded on studies that show how building acoustics (i.e. background noise levels and room reverberation) affect speech intelligibility, as determined by speech recognition tests. What about actual student learning, though? If students do not understand each spoken word in the classroom perfectly, can they still manage to achieve high scholastic success? This presentation will review two recent studies conducted at the University of Nebraska-Lincoln, linking classroom acoustic conditions to student learning outcomes and speech comprehension (rather than simply recognition). In the first, acoustic measurements in two public school districts in the Midwest were correlated with elementary student achievement scores. Results indicate that higher background noise levels, greater than 40 dBA, may lead to unacceptable scholastic performance in language and reading tests. The second study focuses on how room acoustic conditions affect English speech comprehension of native-English-speaking listeners in contrast to English-as-second-language (ESL) listeners, a group which includes 21% of children in the United States K-12 school system. Conclusions are that higher reverberation times and background noise levels reduce speech comprehension in both groups of listeners, but adverse noise conditions are particularly detrimental to ESL listeners.
... Moreover, a signal-to-noise ratio (SNR) that is too low creates real intelligibility difficulties. It has been shown that an SNR of -5 dB allows only 67% of the content to be understood correctly, whereas this rate rises to 84% when the SNR is 0 dB and reaches 90% at 10 dB (Bradley et al., 1999). It turns out that the average noise level of a classroom does not allow good intelligibility, whether the child is hearing-impaired or has normal hearing (Finitzo-Hieber et al., 1978). ...
Thesis
Full-text available
This thesis examines the impact of dysphonia along three main axes: the representation of one's own voice, the transmission of the message, and perception by others. We draw on two populations of female primary school teachers: one of 709 teachers surveyed online and the other of 61 teachers recorded under controlled conditions. Based on an expert perceptual evaluation on the GRBAS scale, our speakers were categorized into two groups of 37 controls and 24 speakers with mild dysphonia. Beyond the substantial vocal complaints and the impaired quality of life that affect both populations, we observe an effect of pupils' age on the prevalence of voice disorders. Analysis of our speakers' productions when reading in quiet or facing a noisy class suggests that teachers use adaptation strategies in their professional practice that could be affected by dysphonia. Dysphonia also appears to affect the transmission of information to pupils aged 7 to 10, since longer reaction times are observed when decoding the voicing contrast in a word identification task when the instruction is produced by a dysphonic speaker. Finally, following an initial free categorization task, the attribution of personality traits by a panel of naive listeners based solely on the teachers' voices reveals vocal profiles associated with more or less positive representations. The moderate agreement observed between the perceived degree of voice disorder and the expert evaluation of dysphonia seems to be linked to the naive listeners' positive perception of roughness.
... Quoting previous studies [21, 22], Choi [19] stresses the significance of early reflections in achieving the required speech intelligibility in classrooms. Bradley [23] stated that the clarity measure C50, representing the ratio of early reflections to late reflections at an early time interval of 50 ms, is the most preferred measure of clarity for speech sounds. ...
Article
High reverberation times (RTs) have always been an acoustic barrier to effective learning in classrooms. Acoustic corrections to reduce RT involve complex acoustic treatment. Previous studies have indicated that classrooms in most schools do not meet the established acoustic criteria, as school authorities refrain from such acoustic treatment. The aim of the study was to optimize the RT within classrooms through easily implementable acoustic corrections. Different combinations of acoustic corrections were tested in eight classrooms, using a step-by-step approach to optimize RT. After each acoustic modification, the RT was measured and the speech clarity parameter C50 was estimated. At the final step, the RT of the classrooms was reduced to a mean value of 0.74 s (standard deviation = 0.04) from the initial mean value of 4.37 s (standard deviation = 0.42). C50 values corresponding to the final acoustic correction were found to fall within the range rated as good speech intelligibility.
... Ishikawa et al. conducted a subjective experiment to observe the influence of background noise on the intelligibility of dysphonic speech [9]. Bradley et al. investigated the combined influences of SNR and room acoustics parameters on speech intelligibility and concluded that the effect of SNR is much more important [10]. Van Wijngaarden et al. compared speech intelligibility in noisy environments for non-native and native listeners [11]. ...
Article
Full-text available
Speech intelligibility is affected by various interfering factors in a speech transmission system. Noise is one of the most common of these factors. Subjective listening experiments were carried out in environments with pink noise, speech noise, and white noise interference, respectively. The perceptual characteristics of the initials, finals, tones, and syllable intelligibility were analyzed, and functional relationships between Chinese speech intelligibility and SNR in noisy environments were derived, which could be used to evaluate or predict Chinese speech intelligibility under noisy transmission conditions.
... Results of studies on the degree of influence of noise and reverberation on speech intelligibility in classrooms are presented in [5], [8], [9], [10]. It has been shown that noise is much more dangerous than reverberation, due to the closeness of noise sources (talking students), and due to the similarity of the noise and speech spectra. ...
Chapter
Full-text available
Articulation tests of noisy and reverberant speech were carried out under different listening modes: (1) diotic speech presentation through headphones, (2) diotic speech presentation through computer speakers, and (3) dichotic speech presentation through headphones. A purpose-developed software toolkit was used to automate the subjective assessment of speech intelligibility. The results of the articulation tests showed that diotic presentation of speech through headphones or computer speakers leads to almost identical results, provided that the distance between the listener and the computer speakers is close to the critical distance. This result is of practical value, since it means that computer loudspeakers are admissible in articulation tests. Articulation testing with dichotic speech presentation through headphones showed that a room can be seen as a filter which can reduce speech intelligibility and that direct sound has a greater effect on speech intelligibility than early reflections.
Article
The quality of the classroom environment has a great impact on the physical and mental health of students and teachers. The COVID-19 pandemic has highlighted the need for new measures and ventilation strategies to be implemented in educational buildings, to ensure indoor air quality in classrooms and to minimise the risk of airborne virus transmission. However, these ventilation protocols can influence the acoustic quality of classrooms and negatively affect students' speech perception and learning performance. This study presents the results obtained from a field measurement campaign carried out to assess the acoustic characteristics of classrooms of the Fuentenueva Campus (University of Granada) and Azurém Campus (University of Minho). Different ventilation operating scenarios (active and inactive) were assessed to evaluate their impact on the indoor acoustic conditions. The reverberation time (RT), the only parameter used in both countries' regulations to assess acoustic conditions, was found to be higher on both campuses than the RT limit values. Comparison of the measured Speech Transmission Index (STI) and background noise values in the active and inactive ventilation scenarios showed a clear variation of the indoor acoustic conditions. The background noise was higher in the active ventilation scenarios (40–57 dBA) than in the inactive ventilation scenarios (34–48 dBA). The average STI values obtained on both campuses for the inactive and active scenarios were 0.54 and 0.51, respectively. In some classrooms an STI difference of 0.1 was found between scenarios. The results obtained in this study provide a broader understanding of the acoustic conditions in university classrooms in Spain and Portugal. The results demonstrate the need to consider the synergies between indoor acoustic and air quality conditions to ensure both that the spaces are safe and that the acoustic conditions do not interfere with students' learning.
The findings show that compliance with the current RT requirements does not ensure that classroom acoustic conditions do not interfere with student performance, and therefore, regulations need to be revised to include additional factors to ensure proper acoustic performance.
Article
Sound ray-tracing simulation in a 3D urban model was performed to assess the spatial distribution of speech intelligibility of outdoor public notification systems. There was a strong correlation between the objective index of intelligibility obtained from the simulation and that obtained from actual broadcasts. However, at some measurement points, the plots departed from the regression line due to sound diffraction and background noise. Furthermore, two methods to improve the spatial distribution were discussed based on the simulation: one is to place the sound source as high as possible, and the other is to use roads as sound propagation paths.
Article
The human auditory system displays a robust capacity to adapt to sudden changes in background noise, allowing for continuous speech comprehension despite changes in background environments. However, despite comprehensive studies characterizing this ability, the computations that underlie this process are not well understood. The first step towards understanding a complex system is to propose a suitable model, but the classical and easily interpreted model for the auditory system, the spectro-temporal receptive field (STRF), cannot match the nonlinear neural dynamics involved in noise adaptation. Here, we utilize a deep neural network (DNN) to model neural adaptation to noise, illustrating its effectiveness at reproducing the complex dynamics at the levels of both individual electrodes and the cortical population. By closely inspecting the model's STRF-like computations over time, we find that the model alters both the gain and shape of its receptive field when adapting to a sudden noise change. We show that the DNN model's gain changes allow it to perform adaptive gain control, while the spectro-temporal change creates noise filtering by altering the inhibitory region of the model's receptive field. Further, we find that models of electrodes in nonprimary auditory cortex also exhibit noise filtering changes in their excitatory regions, suggesting differences in noise filtering mechanisms along the cortical hierarchy. These findings demonstrate the capability of deep neural networks to model complex neural adaptation and offer new hypotheses about the computations the auditory cortex performs to enable noise-robust speech perception in real-world, dynamic environments.
Article
Full-text available
Speech intelligibility in rooms is assumed to depend on the amount of reverberation, and the distribution of sound energy in the impulse response between early and later reflections, the former being considered beneficial, the latter detrimental. This assumption is based on the analysis of a single channel, which is monaural. When the binaural capacities of the auditory system are considered, other phenomena come into play, supporting speech recognition mainly by comparing signal levels and times of arrival at the ears. Starting from these basic mechanisms, the present work shows that, for given monaural conditions, binaural cues are influenced by the types of sound reflection in a room via the correlations they produce on the signals at the ears. A fully scattering scenario, and a totally flat boundary scenario were used to investigate listeners’ performance in a speech intelligibility task in a virtual room with spatialized noise conditions. The amount of correlation at the ears of the listener of the signals coming from both the source and the masker were found to affect speech intelligibility - the stronger either correlation, the greater the speech intelligibility - and their joint effect was larger than the two effects taken separately. A specular setting for the sound reflections was associated with both better reception thresholds and a better usage of spatial cues when deciphering speech. The implications for the acoustic design of rooms with a view to facilitating speech intelligibility are discussed.
Conference Paper
A recent Italian law, DM 11/01/2017 on environmental criteria, imposes reference values for the indoor acoustic quality descriptors of public buildings. These reference values comply with the national standards UNI 11532-1 and UNI 11532-2. Part 2 of the standard series, in particular, describes the procedures and gives limit values for the acoustic comfort descriptors for schools. For schools, adequate acoustic comfort targets are required in terms of indoor noise level and acoustic quality. Indoor acoustic quality targets refer to the reverberation time (RT), the clarity (C50), and/or the speech intelligibility (STI). The limit values for these parameters established by the national standards relate to measured results; however, prediction methods are needed to estimate these parameters during the design phase. The aim of this study is to verify the accuracy of the prediction methods used to determine the intelligibility score. The existing calculation method for the speech transmission index (STI) was modelled in Matlab to determine speech intelligibility in school classrooms. A school building located in central Italy, in the Marche Region, was taken as a case study. The research seeks to determine a correlation factor between predicted and measured speech intelligibility results.
Thesis
In the current regulations, acoustic requirements to guarantee good speech intelligibility (SI) conditions in primary school environments are published in the BB93 document (Building Regulations 93 2015), but no specific recommendations or values are given for higher-education learning spaces. In addition, because universities differ from typical primary or secondary school environments, they present additional and different constraints. In this thesis, the acoustic performance of five teaching spaces at Solent University was analysed and compared with subjective responses of students and staff members in order to identify the main constraints that might be found within a university environment. A relationship was found between background noise levels, which were higher than the standard requirements, and the perceived annoyance reported by staff members, suggesting a need for more restrictive background noise requirements, especially for larger room volumes. Conversely, although the Reverberation Time (RT) measurements of the five spaces conformed to the English requirements specified in the BB93 document, vocal fatigue issues were reported by staff members, suggesting the need for lower and upper RT limits as a function of classroom volume, as suggested, for example, in the DIN18401 German legislation. After analysing both quantitative and qualitative data, this study suggests that the values of Clarity (C50) and Strength (G) should also be included in an updated regulation standard, in order both to better represent auditory perception and to assess appropriate speech clarity and loudness levels at various distances within large lecture theatres. Lastly, given the relevance of sound reinforcement systems for improving SI while assuring vocal comfort for lecturers, the Speech Transmission Index for PA systems (STIPA) method should be used to assess the effectiveness of sound reinforcement systems installed in classrooms.
Article
Recently, various intelligent evacuation guidance systems that can be applied in buildings were studied. Technology development for the evacuation of vulnerable people such as the visually-impaired is necessary. Voice guidance is a method used to lead the visually-impaired toward the evacuation route. However, it is necessary to review whether it is possible to hear and understand the voice guidance during the sounding of fire alarms. In this study, simulations were conducted to predict the sound power level of the voice guidance device that can secure an acceptable sound transmission index of the guide sound and the appropriate distance from the voice guidance device, when a fire alarm sound is generated in a hallway space. The study found that an acceptable sound transmission index was achieved when the sound power level was 100 dB and the appropriate type of sound device was found to be a necklace-type headset or a regular headphone.
Article
Full-text available
In ordinary public rooms, such as classrooms and offices, an absorbent ceiling is the typical first acoustic action. This treatment provides a good acoustic baseline. However, an improvement of specific room acoustic parameters, operating for specific frequencies, can be needed. It has been seen that diffusing elements can be effective additional treatment. In order to choose the right design, placement, and quantity of diffusers, a model to estimate the effect on the acoustics is necessary. This study evaluated whether an SEA model could be used for that purpose, particularly for the cases where diffusers are used in combination with an absorbent ceiling. It was investigated whether the model could handle different quantities of diffusing elements, varied diffusion characteristics, and varied installation patterns. It was found that the model was sensitive to these changes, given that the output from the model in terms of acoustic properties will be reflected by the change of diffuser configuration design. It was also seen that the absorption and scattering of the diffusers could be quantified in a laboratory environment: a reverberation chamber. Through the SEA model, these quantities could be transformed to a full-scale room for estimation of the room acoustic parameters.
Article
Acoustical measurements of three different masks, a surgical mask, a KF94 mask, and an N95 respirator (3M 9210+), were performed and compared with the results obtained with no mask on a dummy head mouth simulator to understand the acoustical effects of the three masks on speech sounds. The speech intelligibility and perceived difficulty of understanding speech sounds with and without an N95 mask were also measured using speech signals convolved with previously measured impulse responses in 12 occupied university classrooms. The acoustic attenuations with the three masks were greatest in front of the talker. The surgical and KF94 masks resulted in 6–12 dB reductions of high-frequency sounds between 2 kHz and 5 kHz, and the N95 respirator decreased sound levels by an additional 2–6 dB at these frequencies. Both the surgical and KF94 masks performed acoustically better than the N95 mask at high frequencies between 2 kHz and 5 kHz. The mean trends of the speech intelligibility test results indicate that young adult listeners at university achieve a mean score of 90% correct at a signal-to-noise ratio (SNR) of +8 dBA or higher for no-mask conditions, which is a 4 dBA lower SNR value than for N95 mask conditions. The N95 mask conditions decreased the correct scores by a maximum of 10% at an SNR of 5 dBA or lower compared to the results obtained with no-mask conditions. The N95 mask conditions increased the perceived difficulty ratings by a maximum of 10% at lower SNR values compared to the results obtained in no-mask conditions. Achieving higher SI scores of 95% or more does not indicate that the listeners experience no difficulty at all in understanding speech sounds. Higher SNR values are beneficial for achieving better speech communication in classrooms, both when the talker wears no mask and when the talker wears an N95 mask.
Chapter
This chapter reviews binaural models available to predict speech intelligibility for different kinds of interference and in the presence of reverberation. A particular effort is made to quantify their performances and to highlight the a priori knowledge they require in order to make a prediction. In addition, cognitive factors that are not included in current models are considered. The lack of these factors may limit the ability of current models to predict speech understanding in real-world listening situations fully.
Article
Full-text available
Speech intelligibility in rooms is determined by both room acoustics characteristics as well as speech-to-noise ratios. These two types of effects are combined in measures such as useful-to-detrimental sound ratios which are directly related to speech intelligibility. This paper reports investigations of optimum acoustical conditions for classrooms using the ODEON room acoustics computer model. By determining conditions that relate to maximum useful-to-detrimental sound ratios, optimum conditions for speech are determined. The results show that an optimum mid-frequency reverberation time for a classroom is approximately 0.5 s, but speech intelligibility is not very sensitive to small deviations from this optimum. Speech intelligibility is influenced more strongly by ambient noise levels. The optimum location of sound absorbing material was found to be on the upper parts of the walls.
Article
A thorough investigation was made of the influence of a single echo, as a function of different parameters, on the audibility of speech. An apparatus was built for the artificial generation of echoes, and measurements were made with a large number of observers under specified conditions.
Article
The Signal-to-Noise Ratio devised by Lochner and Burger contributed an objective design index for predicting speech intelligibility. Their index provided a measure of useful and detrimental reflected speech energy according to the integration and masking characteristics of hearing, and enabled predictions to be made from impulse measurements in models. However, it was found necessary to extend the Signal-to-Noise Ratio theory to account for the effect of fluctuating ambient background noise on speech intelligibility. A modified Signal-to-Noise Ratio was derived from a best-fitting empirical correlation with speech intelligibility in a series of measurements in existing auditoria. In the modified Signal-to-Noise Ratio ambient background noise is no longer considered in terms of its steady state characteristics but more specifically in terms of its transient and spectral characteristics given by the concept of the L10 PNC level. The index has been applied as design criteria to prediction and to evaluation techniques.
Article
Intelligibility tests were performed by teachers and pupils in classrooms under a variety of (road traffic) noise conditions. The intelligibility scores are found to deteriorate at (indoor) noise levels exceeding a critical value of −15 dB with regard to a teacher's long-term (reverberant) speech level. The implications for external noise levels are discussed: typically, an external noise level of 50 dB(A) would imply that the critical indoor level is exceeded for about 20 per cent of teachers.
Article
An observer in an auditorium receives first the direct sound from the source and after that a large number of reflections from the different surfaces in the enclosure. This very intricate sound pattern is analysed by the hearing system and gives rise to those acoustical qualities normally attributed to the auditorium. The present article summarizes the work carried out in this laboratory with the object of throwing more light on the interpretation, by the hearing mechanism, of reflection patterns in auditoria and the application of these principles to the design of auditoria.
Article
The concept of the Modulation Transfer Function (MTF) can successfully be applied to evaluate the quality of speech transmission from a talker to a listener in an auditorium. Typically, depending on the auditorium acoustics, the intensity modulations contained in the original sound are to some extent reduced when measured at a listener's location, especially for higher modulation frequencies. The implementation of such an acoustical MTF analysis with a sinusoidally modulated test signal is described in detail. The performance of a sound transmission system as revealed by the MTF can be expressed in one single index (the Speech Transmission Index, STI), which relates well to the performance as determined by intelligibility tests with talkers and listeners. A review is given of a series of studies on various aspects of the chain of relations between auditorium acoustics, MTF, STI, and speech intelligibility, illustrating the use of this approach for estimating speech intelligibility, either from MTF calculations at the design stage of an auditorium or from MTF measurements in actual situations.
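As a rough illustration of the chain from auditorium acoustics to MTF to a single index, the sketch below assumes an idealized exponential decay with reverberation time T and stationary noise, for which the modulation reduction factor has the closed form m(F) = [1 + (2πFT/13.8)²]^(-1/2) · [1 + 10^(-S/N/10)]^(-1). The index computed here is a simplified single-band version (apparent S/N clipped to ±15 dB and averaged over 14 modulation frequencies); the octave-band weighting and masking corrections of the full STI procedure are deliberately left out, and the function names are hypothetical.

```python
import math

# Modulation frequencies (Hz) in one-third-octave steps from 0.63 to 12.5 Hz.
MOD_FREQS = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
             3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def mtf(mod_freq_hz, rt_s, snr_db):
    """Modulation reduction factor for ideal exponential decay plus steady noise."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def simplified_sti(rt_s, snr_db):
    """Single-band, unweighted index in [0, 1]; an assumption-laden stand-in for STI."""
    total = 0.0
    for f in MOD_FREQS:
        # Keep m strictly inside (0, 1) so the logarithm below is defined.
        m = min(max(mtf(f, rt_s, snr_db), 1e-12), 1.0 - 1e-12)
        apparent_snr = 10.0 * math.log10(m / (1.0 - m))
        apparent_snr = max(-15.0, min(15.0, apparent_snr))  # clip to +/-15 dB
        total += (apparent_snr + 15.0) / 30.0               # map to 0..1
    return total / len(MOD_FREQS)
```

For example, `simplified_sti(0.5, 10.0)` and `simplified_sti(2.0, 10.0)` can be compared to see how a longer reverberation time lowers the index at a fixed signal-to-noise ratio.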
Article
A practical examination of the factors influencing the decay range of room acoustics measurements using maximum-length-sequence measurement techniques is presented. A series of systematic measurements were made to examine the effects of output signal level, ambient noise level, sequence length, and number of averages on the decay range of measured sound decays. The effects of small amounts of time variation and the sequence length relative to the measured reverberation times are also included. A step-by-step approach to optimizing the decay range of measured room responses concludes the discussion.
Article
C-50 is an early-to-late arriving sound ratio used to assess the influence of room acoustics on the clarity and intelligibility of speech. A just noticeable difference in C-50 values was determined for speech sounds in simulated sound fields. Over a range of C-50 values from -3 to +9 dB, representing most situations in rooms for speech, a just noticeable difference was estimated to be 1.1 dB. The corresponding just noticeable difference in Speech Transmission Index (STI) values was 0.03. This is similar to previous related estimates for speech and musical signals. To improve the acoustical characteristics of a room for speech, it is probably necessary to increase C-50 by approximately 3 dB to create a readily detectable improvement in everyday situations.
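To make the early-to-late ratio concrete, here is a minimal sketch that computes C50 as 10·log10 of the energy in the first 50 ms of an impulse response divided by the energy arriving afterwards. It assumes a broadband, single-channel impulse response given as a plain list of samples whose first sample coincides with the direct-sound arrival; the function name and arguments are hypothetical, and octave-band filtering is not included.

```python
import math


def clarity_c50(impulse_response, sample_rate_hz, early_ms=50.0):
    """Early-to-late energy ratio in dB.

    Assumes index 0 of impulse_response is the arrival of the direct sound.
    """
    split = int(round(sample_rate_hz * early_ms / 1000.0))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    if early <= 0.0 or late <= 0.0:
        raise ValueError("need non-zero energy both before and after the split")
    return 10.0 * math.log10(early / late)
```

Two values computed this way can then be compared against the 1.1 dB just noticeable difference reported in the abstract above.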
Article
Speech intelligibility tests and acoustical measurements were made in ten occupied classrooms. Octave-band measurements of background noise levels, early decay times, and reverberation times, as well as various early/late sound ratios, and the center time were obtained. Various octave-band useful/detrimental ratios were calculated along with the speech transmission index. The interrelationships of these measures were considered to evaluate which were most appropriate in classrooms, and the best predictors of speech intelligibility scores were identified. From these results ideal design goals for acoustical conditions for classrooms were determined either in terms of the 50-ms useful/detrimental ratios or from combinations of the reverberation time and background noise level.
Article
Three different types of acoustical measures were compared as predictors of speech intelligibility in rooms of varied size and acoustical conditions. These included signal-to-noise measures, the speech transmission index derived from modulation transfer functions, and useful/detrimental sound ratios obtained from early/late sound ratios, speech, and background levels. The most successful forms of each type of measure were of similar prediction accuracy, but the useful/detrimental ratios based on a 0.08-s early time interval were most accurate. Several physical measures, although based on very different calculation procedures, were quite strongly related to each other.
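The way a useful/detrimental ratio combines an early/late ratio with speech and background levels can be sketched as follows. The sketch assumes the common definition in which useful energy is the early-arriving speech energy E and detrimental energy is the late-arriving speech energy L plus the noise energy N, so that U = 10·log10[E / (L + N)]. Writing c = E/L and S/N = 10·log10[(E + L)/N] gives U = 10·log10{c / [1 + (c + 1)·10^(-S/N/10)]}. The function name is hypothetical, and any frequency weighting used in the cited studies is not reproduced.

```python
import math


def useful_to_detrimental_db(early_late_db, snr_db):
    """Useful/detrimental ratio in dB from an early/late ratio and a speech S/N.

    early_late_db: early-to-late speech energy ratio (e.g. C50 or C80), in dB.
    snr_db: total speech level minus noise level, in dB.
    """
    c = 10.0 ** (early_late_db / 10.0)          # E / L as a linear ratio
    noise_to_speech = 10.0 ** (-snr_db / 10.0)  # N / (E + L)
    return 10.0 * math.log10(c / (1.0 + (c + 1.0) * noise_to_speech))
```

With very little noise the result approaches the early/late ratio itself, and it falls as the noise level rises, which is the behaviour that lets one number reflect both room acoustics and S/N.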
J. Acoust. Soc. Am., Vol. 106, No. 4, Pt. 1, October 1999. Bradley, Reich, and Norcross: Speech intelligibility in rooms.