Book

Psychoacoustics

Chapters (12)

This chapter addresses the hearing area and the threshold in quiet.
In this chapter, the preprocessing of sound in the peripheral system and information processing in the neural system are addressed.
The masking of a pure tone by noise or by other tones is described in this chapter. Both psychoacoustical tuning curves and temporal effects in masking are addressed, effects related to the pulsation threshold are described, and finally, models of masking are developed.
In this chapter the pitch of pure tones, complex tones and noise bands is addressed, and models for spectral pitch and virtual pitch are developed. In addition, the pitch strength of various sounds is assessed.
The concept of critical bands is introduced in this chapter, methods for determining their characteristics are explained, and the scale of critical-band rate is developed. The definitions of critical-band level and excitation level are given and the three-dimensional excitation level versus critical-band rate versus time pattern is illustrated.
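As a rough illustration of the critical-band rate scale, the sketch below maps frequency to critical-band rate in Bark. It assumes the analytic approximation published by Zwicker and Terhardt (1980); the chapter summary itself gives no formula, and the function name is hypothetical.

```python
import math


def critical_band_rate(frequency_hz: float) -> float:
    """Approximate critical-band rate z (in Bark) for a frequency in Hz.

    Assumption: the Zwicker and Terhardt (1980) approximation
    z = 13 * atan(0.76 * f) + 3.5 * atan((f / 7.5) ** 2), with f in kHz.
    """
    f_khz = frequency_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)


if __name__ == "__main__":
    # Print the approximate critical-band rate for a few frequencies.
    for f in (100, 500, 1000, 4000, 16000):
        print(f"{f:>6} Hz -> {critical_band_rate(f):5.2f} Bark")
```

Under this approximation the audible range spans roughly 24 Bark, which matches the usual count of 24 critical bands.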
Two different kinds of sound changes are discussed in this chapter. One is the variation that may be compared to variation in water level: there is always some water but the level varies as a function of time. In acoustics, modulations are typical changes of the sort we call variations. The other kind of change is that of differences. One apple may be different from another apple. In this case, we compare one piece with another piece. In acoustics, this means that we compare one sound with another sound presented after a pause. Because these two kinds of changes may activate different processing features in our hearing system, the first by direct and quick comparison and the second by activating and introducing memory, it is necessary to differentiate strictly between the two kinds of changes. Just-noticeable variations are useful in producing scales of sensations related to position, for example pitch through the frequency-location transformation as discussed in Chap. 5. However, both just-noticeable variations and just-noticeable differences are important as the “stones” on which the “house of sensations” is built.
Previously, there has been a tendency to transfer everything in steady-state sounds not related to the sensations of loudness or pitch, to a residual basket of sensations called timbre. Using this definition of timbre, it is necessary to extract from the mixture of sensations those that may be important. The sensation of “sharpness”, which may be related to what is called “density”, seems to be one of these. Closely related to sharpness, however inversely, is a sensation called sensory pleasantness. This sensation, however, also depends on other sensations such as roughness, loudness, and tonalness.
In this chapter, the fluctuation strength of amplitude-modulated broad-band noise, amplitude-modulated pure tones and frequency-modulated pure tones is addressed, and the dependence of fluctuation strength on modulation frequency, sound pressure level, modulation depth, centre frequency, and frequency deviation is assessed. In addition, the fluctuation strength of modulated sounds is compared to the fluctuation strength of narrow-band noises. Finally, a model of fluctuation strength based on the temporal variation of the masking pattern or loudness pattern is proposed.
Using a 100% amplitude-modulated 1-kHz tone and increasing the modulation frequency from low to high values, three different areas of sensation are traversed. At very low modulation frequencies the loudness changes slowly up and down. The sensation produced is that of fluctuation. This sensation reaches a maximum at modulation frequencies near 4 Hz and decreases for higher modulation frequencies. At about 15 Hz, another type of sensation, roughness, starts to increase. It reaches its maximum near modulation frequencies of 70 Hz and decreases at higher modulation frequencies. As roughness decreases, the sensation of hearing three separately audible tones increases. This sensation is small for modulation frequencies near 150 Hz; it increases strongly, however, for larger modulation frequencies. This behaviour indicates that roughness is created by the relatively quick changes produced by modulation frequencies in the region between about 15 and 300 Hz. Exact periodic modulation is not needed, but the spectrum of the modulating function has to lie between 15 and 300 Hz in order to produce roughness. For this reason, most narrow-band noises sound rough even though there is no periodic change in envelope or frequency. Roughness is again a sensation which we can consider while ignoring other sensations.
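The "three separately audible tones" can be made concrete with a small sketch. Assuming sinusoidal modulation (the summary does not state the modulator shape), an amplitude-modulated tone is the sum of the carrier and two sidebands at the carrier frequency plus and minus the modulation frequency. The function names below are hypothetical; the code checks the identity numerically.

```python
import math


def am_components(carrier_hz, modulation_hz, depth=1.0):
    """Return (frequency, amplitude) pairs of a sinusoidally AM tone."""
    return [
        (carrier_hz - modulation_hz, depth / 2.0),  # lower sideband
        (carrier_hz, 1.0),                          # carrier
        (carrier_hz + modulation_hz, depth / 2.0),  # upper sideband
    ]


def am_sample(t, carrier_hz, modulation_hz, depth=1.0):
    """Sample of the AM tone written as envelope times carrier."""
    envelope = 1.0 + depth * math.cos(2.0 * math.pi * modulation_hz * t)
    return envelope * math.cos(2.0 * math.pi * carrier_hz * t)


def component_sum(t, components):
    """Sample of the same signal written as a sum of steady cosines."""
    return sum(a * math.cos(2.0 * math.pi * f * t) for f, a in components)


if __name__ == "__main__":
    # 1-kHz carrier, 100% modulation at 70 Hz (near the roughness maximum).
    parts = am_components(1000.0, 70.0, depth=1.0)
    print(parts)
    worst = max(
        abs(am_sample(k / 48000.0, 1000.0, 70.0) - component_sum(k / 48000.0, parts))
        for k in range(4800)
    )
    print("largest difference between the two forms:", worst)
```

At low modulation frequencies the three components are too close together to be heard separately; as the spacing grows they can be heard as individual tones, which is consistent with the transition described above.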
When we talk about duration, we normally think of objective duration, i.e. physical duration measured in seconds, milliseconds or minutes. This is so, although we often check durations by listening to them in music, for example, or by giving a talk, where a short silence can add emphasis. If such durations can be measured by listening, they cannot be objective durations but must be subjective because they correspond to sensations. Subjective duration is not drastically different from objective duration if the durations of long-lasting sound bursts are compared. Therefore, it is often assumed that subjective duration and objective duration are almost equal. This is not so, however, when the duration of sound bursts is compared with the duration of sound pauses. In this case, drastic differences appear which indicate the need to consider subjective duration as a separate sensation.
In this chapter, the physical data of the temporal envelope of sounds eliciting the perception of a subjectively uniform rhythm are shown. In addition, the rhythm of speech and music is discussed. For music, the hearing sensation of rhythm is compared with the hearing sensations of fluctuation strength and subjective duration. Finally, a model based on the temporal variation of loudness is proposed for the hearing sensation rhythm.
In our human society, acoustical communication plays a very important role, because besides our receiving system (our ears), we also have a transmitting system (our speech organ). Therefore, the applications of psychoacoustics are spread across many different fields. Most often, psychoacoustical data provide the fundamental basis from which solutions to problems (such as finding the limiting characteristics of transmitting systems, or the limits of noise production) are elaborated. Even in the region of art, to which music belongs, the characteristics of our hearing system as the receiver of music play the dominant role. Consequently, psychoacoustics is very important also in musical acoustics. In view of the diversity of the field, it is impossible to discuss a large number of different applications in detail. However, this chapter gives some impression of how much psychoacoustics is involved in the different fields that have been mentioned, and may give those who want to solve similar problems some hints on how they might proceed. The list of papers for each section of the chapter may also be of help.
... Such level-based fluctuation metrics are similar in principle, and include the peak index [4], office noise index [5,6], noise pollution level [5–7], noise climate [8], and MA,eq [2]. Some studies have reported reverberation times (T in seconds) [2,5], and psychoacoustic metrics [9] (e.g., loudness, etc.) [6,10,11]. The other major group of related metrics includes those derived from octave/one-third octave band spectra such as noise rating, preferred noise criterion, balance between spectral regions, etc. [2,6,10–14]. ...
... Those based on psychoacoustics. These metrics include binaural loudness (N in sones [9]) calculated using Moore and Glasberg's time-varying binaural loudness model [26] with the middle ear transfer function presented in [27]; sharpness (S in acum [9]) calculated using [28]; roughness (R in asper [9]) calculated using [29]; fluctuation strength (FS in vacil [9]) calculated using [30]; and loudness fluctuation (N Fluctuation, a measure of the fluctuation of loudness values over time) calculated using [28]. For all metrics besides N, the value reported is the average value of the two ears. ...
Article
Open-plan offices (OPOs) have been around for more than half a century now, chronicling the vicissitudes of workplace topography amongst other factors. This paper addresses one such factor – the sound environment in occupied OPOs in relation to several objective workplace parameters, using measurements in contemporary OPOs and comparisons with studies over the last 50 years. Omnidirectional and binaural sound measurements were conducted in 43 offices during typical working hours. The results describe variation in several acoustic and psychoacoustic metrics, and present statistical models that predict these metrics as a function of the number of workstations in offices. LA,eq of 53.6 dB is typical for occupied OPOs, with spectral slope of approximately −4 dB/octave. LA,eq values do not vary much over the workplace parameters studied (e.g., floor plate area, work activity, etc), except for −2.7 dB and −4.1 dB differences between offices with/without carpeting, and offices with ceiling absorption but with/without carpeting, respectively; most likely from reduced floor impact noise leading to speech level reduction. Sound fluctuation, as characterised by the metric Noise Climate (NCl: LA10 – LA90) and the psychoacoustic Fluctuation Strength (FS), decreases significantly with increasing number of workstations in OPOs. This suggests lesser auditory distraction in larger offices, which needs further investigation. In terms of historical trends, OPOs have become quieter over the years, especially background noise quantified as LA90, although there are several subtleties. Overall, current findings can inform several OPO design perspectives including policy documents, provide values for laboratory simulations of OPO acoustic environments, help interpret subjective impressions of OPO occupants, etc.
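The Noise Climate metric named in the abstract (NCl: LA10 – LA90) can be sketched as follows. The sketch assumes a series of short-term A-weighted levels in dB is already available, and it uses a simple linearly interpolated percentile; the abstract does not say how the authors computed the percentile levels, and the function names are hypothetical.

```python
def exceedance_level(levels_db, percent_exceeded):
    """Level exceeded for `percent_exceeded` percent of the samples.

    For example, percent_exceeded=10 gives L10, which is the 90th
    percentile of the level distribution.
    """
    if not levels_db:
        raise ValueError("levels_db must not be empty")
    ordered = sorted(levels_db)
    # Fractional position of the (100 - N)th percentile in the sorted list.
    rank = (100.0 - percent_exceeded) / 100.0 * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def noise_climate(levels_db):
    """Noise Climate: L10 minus L90 of the given level series, in dB."""
    return exceedance_level(levels_db, 10) - exceedance_level(levels_db, 90)


if __name__ == "__main__":
    # Made-up short-term levels in dB(A), for illustration only.
    example_levels = [48.0, 50.5, 52.0, 53.5, 55.0, 57.5, 51.0, 49.5, 54.0, 56.0]
    print("NCl =", round(noise_climate(example_levels), 2), "dB")
```

A steady sound gives an NCl of 0 dB, while a strongly fluctuating sound gives a large value, which is why the metric serves as a simple level-based indicator of fluctuation.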
... The main purpose of SQ metrics is to replace jury tests with an acoustic measurement that would provide an accurate prediction of sound quality judgements made by human listeners. In recent studies several SQ metrics, such as loudness (Fastl, Zwicker, 2007; Klonari et al., 2011), sharpness (Leite et al., 2008; Wang et al., 2007), roughness (Aures, 1985a; Miśkiewicz et al., 2007; Szczepańska-Antosik, 2008; Vencovský, 2016), and tonality (Aures, 1985b; Cuddy et al., 2007; Terhardt et al., 1982) were developed and used for sound quality evaluation (Carletti, 2013; Pleban, 2010). The calculations of individual SQ metrics have been used to compute combined, overall measures of sound quality, such as pleasantness and unbiased annoyance (Kaczmarek, Preis, 2010). ...
... Pleasantness is predicted from the calculations of loudness, sharpness, roughness, and tonality. A detailed description of those metrics is available in the literature (Fastl, Zwicker, 2007). Pleasantness is calculated in MATLAB using Eq. ...
... To decompose the signal into 24 critical bands, the wavelet parameters (η, σ, s) should be tuned in accordance with the critical bands. The frequency parameters of the mother wavelet (the lower limit frequency, the upper limit frequency, and the centre frequency) correspond exactly to those used in Zwicker's critical band model (Fastl, Zwicker, 2007). The centre angular frequency ω_c, based on Eq. (8) is: ...
Article
The purpose of this study was to develop a sound quality model for real time active sound quality control systems. The model is based on an optimal analytic wavelet transform (OAWT) used along with a back propagation neural network (BPNN) in which the initial weights and thresholds are determined by particle swarm optimisation (PSO). In the model the input signal is decomposed into 24 critical bands to extract a feature matrix, based on energy, mean, and standard deviation indices of the sub signal scalogram obtained by OAWT. The feature matrix is fed into the neural network input to determine the psychoacoustic parameters used for sound quality evaluation. The results of the study show that the present model is in good agreement with psychoacoustic models of sound quality metrics and enables evaluation of the quality of sound at a lower computational cost than the existing models.
... This minimum perceivable tone difference increases towards higher frequencies [153,190]. The field of psycho-acoustics [42] has made great advances in describing the hearing impression of signals represented in the frequency domain by defining quantities such as loudness, sharpness, roughness and tonality. ...
... However, most loudspeakers have their own transfer function and distort the sound somewhat with respect to the real sound which was recorded/predicted. According to [42], headphones, even rather cheap ones, typically have a flat transfer function and give more accurate hearing impressions than loudspeaker boxes. The singular value decomposition (SVD) is the Swiss Army knife of engineering. ...
... The physical sound pressure p B 3 , being the input to the human hearing system, should then be subjected to some form of 'transfer function' that accounts for the human perceived annoyance. There have been promising advances in the field of psycho-acoustics to study these dependencies [42]. A simpler construction of an objective function would subject the sound pressure p B 3 to e. g. an A-weighting [38] to account at least for the frequency-dependent human perception of sound. ...
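The A-weighting mentioned in the excerpt above can be sketched as a frequency-dependent gain in dB. The sketch assumes the standard analytic A-weighting curve with the usual pole frequencies (20.6 Hz, 107.7 Hz, 737.9 Hz, 12194 Hz) and a +2.00 dB normalisation so that the gain is approximately 0 dB at 1 kHz; the cited thesis does not spell out the formula, and the function name is hypothetical.

```python
import math


def a_weighting_db(frequency_hz):
    """A-weighting gain in dB at the given frequency (must be > 0 Hz).

    Assumption: the standard analytic form of the A-weighting curve,
    normalised to roughly 0 dB at 1 kHz.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    f2 = frequency_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00


if __name__ == "__main__":
    # Low frequencies are attenuated strongly; the curve is near 0 dB at 1 kHz.
    for f in (31.5, 100.0, 1000.0, 4000.0, 10000.0):
        print(f"{f:>8.1f} Hz -> {a_weighting_db(f):6.1f} dB")
```

Such a weighting captures only the frequency dependence of hearing sensitivity; it does not model the psychoacoustic quantities (loudness, sharpness, roughness) that the excerpt contrasts it with.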
Thesis
[ Link to PhD defense video: https://www.youtube.com/watch?v=IEVuF2rJOYs&t=6s ] This thesis is the result of a 4-year collaboration between the Technical University of Munich and the BMW Group. The goal was to apply substructuring methods to the noise, vibration, and harshness (NVH) engineering needed for integrating electric climate compressors in upcoming vehicles. The compressor is one of the major contributors to the cabin noise in battery electric vehicles (BEVs). An accurate yet practical development process for its vehicle integration is crucial for industry. Specifically, the aim was to simulate the compressor noise in the cabin for different virtual design variants of the isolation concept. To this end, methods from two broader fields were applied: first, the excitation of the compressor was modeled with component transfer path analysis (TPA) methods; second, the full transfer path from the compressor to the driver’s ear was assembled from multiple subcomponent models via dynamic substructuring (DS). In pursuing these goals, several gaps in the current technology were identified, which are addressed in this thesis. With frequency based substructuring (FBS), a subclass of DS, it is possible to couple experimental and numerical substructure models in a virtual assembly. For the compressor, it was found that including rigid body models in the transfer path is a valuable addition. The proper formulation and integration of rigid body models in the framework of FBS is presented. Another bottleneck at the onset of this project was the proper modeling of rubber bushings in the transfer path. A novel method for experimentally identifying accurate substructure models of rubber isolators was developed. The rotating components in the compressor introduce gyroscopic effects that influence its dynamics. A novel substructuring method for virtually coupling gyroscopic terms to a component showed that these effects are not relevant for the compressor case. The compressor’s excitation is described by blocked forces. Applying the blocked forces to the substructured transfer path of the assembly makes it possible to simulate the sound in a virtual prototype. One goal was to make the simulated results audible to people who are not acoustics experts, which required the creation of sound files. This allowed for a subjective comparison of different designs at an early development stage. Since the noise predictions with TPA are typically in the frequency domain, some signal processing is required to create sound files in the time domain. Different methods for auralization are compared, a comparison that could not be found in the existing TPA literature. Due to the inverse process for identifying the blocked forces, measurement noise can be amplified to unacceptably high levels, which are audible in the sound predictions. Regularization methods have the potential to significantly suppress the noise amplification, which is explained and exemplified for blocked force TPA. Additionally, it was found that structure-borne sound transmission alone was not sufficient to describe the compressor noise in the cabin. The compressor also radiates air-borne sound directly from its housing, which is included in the NVH model by means of equivalent monopoles. The application examples at the end of the thesis extend the current state of the art by showing how the modular vehicle models can be used for early-phase, parametric design optimizations on a complex NVH problem.
... In contrast, the subjectively perceived annoyance of sounds tends to correlate with psychoacoustic loudness as well as with other psychoacoustic metrics. The exact correlations can usually only be determined in listening experiments [33]. It is to be expected that the movements of CI users performed in everyday life usually lead to less complex and possibly also smaller acceleration processes of the implant in the head than those provoked in the laboratory. ...
Article
Severe to profound hearing loss and deafness are treated with a cochlear implant (CI) fitting. Today, the indication for CI fitting in adult patients with single-sided deafness (SSD) has been recognized and financed in Germany. A magnet in the center of the CI receiver coil holds the transmitter coil, which is worn on the outside of the head, in place. CIs from the manufacturers Advanced Bionics (Valencia, California, USA), Cochlear (Macquarie, Australia), and MED-EL (Innsbruck, Austria) are equipped with movable magnets so that MRI examinations in CI patients can be performed without side effects and without the risk of a magnet dislocation. For a 16-year-old male adolescent presented in this case report, who suffered from SSD, the indication for a CI was established after detailed diagnostics and a CI stimulator was implanted. During the postoperative period, the patient described a click-noise of the CI magnet, which was caused by jerky movements of the head (head shaking) as well as by walking. This click-noise was perceived as a severe impairment. Despite intensive rehabilitation, no hearing success was achieved due to the click-noise, which was perceived as stressful. Finally, an explantation was performed at the patient's request. The manufacturer checked the explant and could not find any indications of functional disorders. Acoustic measurements were performed on the explant at Technical University of Darmstadt in an anechoic chamber by “shaking” the explant repeatedly with a test set-up developed for this purpose. The equivalent continuous sound pressure level (Leq) measured at a distance of 100 mm was 29 dB above 1.5 kHz with a peak level (Lpeak) of 67.2 dB. An implant demonstration specimen was investigated as well, where an Leq of 31 dB and an Lpeak of 66.4 dB were measured using the same measurement setup. 
In SSD patients, sound - similar to bone conducting hearing aids - could be transcranially transmitted via bone conduction as well as via soft tissue, so that the normal hearing ear can perceive the click-noise of the CI magnet. The click-noise showed dominant sound pressures at frequencies above 1.5 kHz. In this frequency range, bone and soft tissue conduct the sound particularly well. In addition, the transcranial attenuation at 1.5 kHz is around 0 dB, which may also contribute to the hearing of the click-noise through the healthy ear. In order to reduce click-noise, the CI model under investigation has now been modified in terms of design. Conclusion: When advising SSD patients for a CI fitting, the possible occurrence of click-noise in the opposite ear should be pointed out.
... Aspiration durations for each step of the VOT continua appear in Table 2. A non-linear (logarithmic) step size was chosen because psycho-acoustic perception tends to follow Weber's law (subjective sensation is proportional to the logarithm of the stimulus intensity); e.g., Fastl and Zwicker (2006). See Rosen and Howell (1981) for results on VOT, and Stevens (2000, p. 228) for a similar effect on the perception of duration of burst. ...
Article
Multimodal integration is the formation of a coherent percept from different sensory inputs such as vision, audition, and somatosensation. Most research on multimodal integration in speech perception has focused on audio-visual integration. In recent years, audio-tactile integration has also been investigated, and it has been established that puffs of air applied to the skin and timed with listening tasks shift the perception of voicing by naive listeners. The current study has replicated and extended these findings by testing the effect of air puffs on gradations of voice onset time along a continuum rather than the voiced and voiceless endpoints of the original work. Three continua were tested: bilabial (“pa/ba”), velar (“ka/ga”), and a vowel continuum (“head/hid”) used as a control. The presence of air puffs was found to significantly increase the likelihood of choosing voiceless responses for the two VOT continua but had no effect on choices for the vowel continuum. Analysis of response times revealed that the presence of air puffs lengthened responses for intermediate (ambiguous) stimuli and shortened them for endpoint (non-ambiguous) stimuli. The slowest response times were observed for the intermediate steps for all three continua, but for the bilabial continuum this effect interacted with the presence of air puffs: responses were slower in the presence of air puffs, and faster in their absence. This suggests that during integration auditory and aero-tactile inputs are weighted differently by the perceptual system, with the latter exerting greater influence in those cases where the auditory cues for voicing are ambiguous.
... On a second level, the participants' perception of the telemeeting is a function of time as well. At the level of perceptual processing, an example for temporal effects in auditory perception is temporal masking, see e.g., [283], [284]. At a higher level, auditory and visual objects and other perceptual features are formed based on the telemeeting characteristics captured by the human auditory and visual systems. ...
Article
Telemeetings such as audiovisual conferences or virtual meetings play an increasingly important role in our professional and private lives. For that reason, system developers and service providers will strive for an optimal experience for the user, while at the same time optimizing technical and financial resources. This leads to the discipline of Quality of Experience (QoE), an active field originating from the telecommunication and multimedia engineering domains, that strives for understanding, measuring, and designing the quality experience with multimedia technology. This paper provides the reader with an entry point to the large and still growing field of QoE of telemeetings, by taking a holistic perspective, considering both technical and non-technical aspects, and by focusing on current and near-future services. Addressing both researchers and practitioners, the paper first provides a comprehensive survey of factors and processes that contribute to the QoE of telemeetings, followed by an overview of relevant state-of-the-art methods for QoE assessment. To embed this knowledge into recent technology developments, the paper continues with an overview of current trends, focusing on the field of eXtended Reality (XR) applications for communication purposes. Given the complexity of telemeeting QoE and the current trends, new challenges for a QoE assessment of telemeetings are identified. To overcome these challenges, the paper presents a novel Profile Template for characterizing telemeetings from the holistic perspective endorsed in this paper.
... The improvement in the degree of similarity resulted in synthesized signals that were tuned more accurately relative to the corresponding recorded reference signals (Figure 4). The average absolute deviation of the fundamental frequency of the synthesized signals from the recorded signals was reduced from 40 cents in the case of PM signals, which corresponds to an interval of almost a half semitone (a half-semitone deviation is 50 cents), to only 2 cents in the case of PM-OPT signals (which is a non-noticeable difference [28]). In 5 out of 9 notes, the PM-OPT model led to synthesized signals perfectly in tune with the corresponding recordings (0 cent deviation, Figure 4 notes 2-5, 7). ...
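The cent values quoted above follow from the standard definition of the cent as one 1200th of an octave on a logarithmic frequency scale. The sketch below shows the conversion; the function name and the 440 Hz example frequency are hypothetical and not taken from the cited study.

```python
import math


def cents_deviation(frequency_hz, reference_hz):
    """Signed deviation of frequency_hz from reference_hz in cents.

    One octave (a frequency ratio of 2) is 1200 cents, so one
    equal-tempered semitone is 100 cents.
    """
    if frequency_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(frequency_hz / reference_hz)


if __name__ == "__main__":
    reference = 440.0  # hypothetical reference fundamental in Hz
    for cents in (2.0, 40.0, 50.0):
        # Frequency that lies `cents` above the reference.
        shifted = reference * 2.0 ** (cents / 1200.0)
        print(f"{cents:4.0f} cents above {reference} Hz is {shifted:.2f} Hz")
```

Because the scale is logarithmic, a deviation of 40 cents corresponds to the same frequency ratio for every note, whatever its fundamental frequency.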
Article
A simulation of a musical instrument is considered successful when the model’s synthesized sound closely resembles the real instrument’s sound. In this work, we propose the integration of physical modeling (PM) methods with an optimization process to regulate a generated digital signal. Its goal is to find a new set of values for the PM’s parameters that leads to a synthesized signal matching, as closely as possible, the reference signals corresponding to the physical musical instrument. The reference signals can be: (a) described by their acoustic characteristics (e.g., fundamental frequencies, inharmonicity, etc.) and/or (b) the signals themselves (e.g., impedances, recordings, etc.). We put this method into practice for a commercial recorder, simulated using the digital waveguides’ PM technique. The reference signals, in our case, are the recorded signals of the physical instrument. The degree of similarity between the synthesized (PM) and the recorded signal (musical instrument) is calculated by the signals’ linear cross-correlation. Our results show that the adoption of the optimization process resulted in more realistic synthesized signals by (a) enhancing the degree of similarity between the synthesized and the recorded signal (the average absolute Pearson Correlation Coefficient increased from 0.13 to 0.67), (b) resolving mistuning issues (the average absolute deviation of the synthesized from the recorded signals’ pitches was reduced from 40 cents to the non-noticeable level of 2 cents) and (c) yielding similar sound color characteristics and matched overtones (the average absolute deviation of the synthesized from the recorded signals’ first five partials was reduced from 41 cents to 2 cents).
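The similarity measure named in the abstract can be illustrated with a minimal sketch of the Pearson correlation coefficient between two equal-length signals at zero lag. How the authors aligned, windowed, or lagged their signals is not stated in the abstract, so this is only an assumption-laden illustration with hypothetical function names and synthetic test signals.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signals must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denominator = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if denominator == 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return sum(a * b for a, b in zip(dx, dy)) / denominator


if __name__ == "__main__":
    n = 1000
    # Synthetic stand-ins for a recorded and a synthesized signal.
    recorded = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
    synthesized = [0.8 * math.sin(2.0 * math.pi * k / n + 0.3) for k in range(n)]
    print("similarity:", round(pearson_correlation(recorded, synthesized), 3))
```

The coefficient is insensitive to overall gain, so a synthesized signal that differs from the recording only in level still scores close to 1, whereas a phase or pitch mismatch lowers the score.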
... Conversely, several authors emphasize that, because of the interactions among multiple factors, studying the soundscape of public spaces through recordings, laboratory tests, or simulations is not recommended, and suggest that the on-site survey is the most appropriate method for capturing the real influence of all these aspects (Yang and Kang 2005; Zhang and Kang 2007). Recent studies such as Kang et al. (2016) recommend recording acoustic environments with binaural technology to analyse psychoacoustic parameters (loudness, roughness, sharpness, and fluctuation strength) that link physical attributes with perceptual approaches (Fastl and Zwicker 2007) as predicted human responses. Nevertheless, these parameters do not necessarily represent how people perceive sounds: psychophysical approaches tend to underestimate the impact of other environmental stressors by analysing sound in isolation from its natural, ecological context (Raimbault 2006). ...
Article
Several policies have been developed to improve the quality of life for older adults in cities, with soundscape being one of the factors that most influence general comfort. This work presents a systematic review, based on the PRISMA methodology, of the existing scientific literature that identifies differences in sensitivity, noise annoyance, acoustic comfort, and urban soundscape preference between older adults and the rest of the population. Soundscape evaluation is a complex issue, understood as the relationship between human beings and the acoustic environment, based on sound, environment, and people’s perceptions of both. Among the personal variables, age is related to physiological, psychological, social, and cultural factors that lead people to evaluate the urban soundscape in a certain way. Our results show that the greatest difference between older adults and other age groups is in noise annoyance, and the smallest differences are in sound level evaluation. More research is needed in this field to achieve comfortable urban areas through soundscape design that is pleasant and inclusive for all, including older adults as a vulnerable group, and that contributes to improving their health and quality of life.
... Data for the same narrowband LNN stimuli as used in this study were available but only for monaural presentation. However, data for a narrowband uniform exciting noise (UEN1, Fastl and Zwicker, 2007) with a center frequency of 1,370 Hz and a bandwidth of 210 Hz were available for both monaural aided and binaural aided conditions. For narrowband stimuli, aided conditions imply that not the same level but the same (monaural) loudness was presented to each ear. ...
Article
The individual loudness perception of a patient plays an important role in hearing aid satisfaction and use in daily life. Hearing aid fitting and development might benefit from individualized loudness models (ILMs), enabling better adaptation of the processing to individual needs. The central question is whether additional parameters are required for ILMs beyond non-linear cochlear gain loss and linear attenuation common to existing loudness models for the hearing impaired (HI). Here, loudness perception in eight normal hearing (NH) and eight HI listeners was measured in conditions ranging from monaural narrowband to binaural broadband, to systematically assess spectral and binaural loudness summation and their interdependence. A binaural summation stage was devised with empirical monaural loudness judgments serving as input. While NH showed binaural inhibition in line with the literature, binaural summation and its inter-subject variability were increased in HI, indicating the necessity for individualized binaural summation. Toward ILMs, a recent monaural loudness model was extended with the suggested binaural stage, and the number and type of additional parameters required to describe and to predict individual loudness were assessed. In addition to one parameter for the individual amount of binaural summation, a bandwidth-dependent monaural parameter was required to successfully account for individual spectral summation.
... with the damping matrix D. The electric machine thus acts as a disturbing noise source. This includes all frequencies below 10 kHz [34]. The relevant speed range for the machine considered here is up to 34 % of the maximum speed. ...
Article
The present study investigates how sensorineural hearing loss affects the perception of suprathreshold tonal components in noise. Masked threshold, tonality, and loudness of the tonal content are measured for one, two, or four simultaneously presented sinusoids. The levels of the suprathreshold tonal components were chosen relative to the individual masked thresholds. Masked thresholds were significantly higher for the hearing-impaired listeners than for normal-hearing listeners. In general, tonality was the same for hearing-impaired and normal-hearing listeners at the same level above threshold. The same was found for the loudness of the tonal content.
Article
Nowadays additive manufacturing is affected by a rapid expansion of possible applications. It is defined as a set of technologies that allow the production of components from 3D digital models in a short time by adding material layer by layer. It shows enormous potential to support wind musical instruments manufacturing because the design of complex shapes could produce unexplored and unconventional sounds, together with external customization capabilities. The change in the production process, material and shape could affect the resulting sound. This work aims to compare the music performances of 3D-printed trombone mouthpieces using both Fused Deposition Modelling and Stereolithography techniques, compared to the commercial brass one. The quantitative comparison is made applying a Design of Experiment methodology, to detect the main additive manufacturing parameters that affect the sound quality. Digital audio processing techniques, such as spectral analysis, cross-correlation and psychoacoustic analysis in terms of loudness, roughness and fluctuation strength have been applied to evaluate sounds. The methodology herein applied could be used as a standard for future studies on additively manufactured musical instruments.
Article
This article presents the theoretical framework for some of the problems of 21st-century cities, such as climate change, urbanisation and population ageing, in order to establish the background that guides research into the thermal and acoustic comfort of older people in the city's public spaces. To this end, it analyses international policies, urban planning concepts and all the contemporary strands that have arisen as cities adapt to demographic and climatic phenomena, as well as their influence on the health of the population. In addition, it conceptualises the phenomenon of population ageing, the global, regional and local reality of Spain with regard to this demographic process, and the public policies that have been put forward as adaptation measures in cities. Finally, the concepts of thermal and acoustic comfort are explained, identifying the state of the art and the main influencing variables, through a summary of the systematic literature review on the perception of thermal and acoustic comfort among older adults and how it differs from that of other age groups.
Article
This review presents a sequence of exemplary experience-based encounters with self-organizing systems on different levels of difficulty. Based on hands-on experiments and creative modeling it provides a viable educational road to build up a deeper understanding of self-organization principles and their comprehensive nature. Theories of self-organization describe how patterns, structures and new types of behavior emerge in energetically open systems, resulting from the local interaction of many components. As an external control instance is missing, the underlying philosophy is counterintuitive to our habits of causal thinking. This thematic and conceptual framework impacts on many STEM domains and presents a blueprint for modeling emergent structures and complex functions in natural and technological systems. It reveals unifying principles that can help in reducing, in structuring and, finally, in understanding and controlling the emerging complexity. An overview across diverse STEM domains highlights the role of this overarching concept. This cross-disciplinary approach can help in improving the dialogue and the knowledge exchange between the individual fields. Moreover, in a self-referential fashion, the modeling of self-organization provides us with fresh perspectives to reflect our own creative processes.
Article
The interior sound is a key criterion for the purchase decision of a vehicle. Audible tonal components in the interior of electric vehicles, in particular, significantly lower the pleasantness and possibly the acceptance of electric powertrains. These tonal components are commonly audible within the electric vehicle interior during transient driving conditions. The present study investigates the influence of different parameters of these tonal components, which are typically observed in recordings of interior sounds, on the perceived magnitude of tonal content (MOTC). Instead of recordings, artificial sounds were used to allow for controlled and systematic variations of these parameters. The parameters are the level and number of tonal components. In addition, the influence of temporal amplitude modulations, which, in the case of interior sounds, result from structural resonances, was investigated. Psychoacoustic experiments with normal-hearing listeners indicate that all these parameters have an impact on the perceived MOTC of the sounds: the perceived MOTC increases as the level of the tonal components increases and when an overtone is added. An increase in modulation frequency or modulation depth also leads to an increase in perceived MOTC. The experimental results are compared to predictions of a model of tonalness in a musical context and to the current version of the ECMA-74 standard. It is shown that those models predict basic trends that were observed in the data.
Chapter
This chapter starts by discussing the most fundamental of questions regarding an auditory object: under what conditions does it exist? Two physical attributes limit the audibility of a frequency component of sound: the sound pressure level (SPL) and frequency. The attributes interact with tonal signals; the SPL threshold of audibility depends in a complicated manner on frequency. The chapter first discusses these issues and then covers the basics of masking. Spectral masking, that is, how the masker sound affects the detection threshold of the test sound, is best described by plotting the masking threshold as a function of frequency. A conceptual illustration of temporal masking is shown, both for a sound occurring before the masker, called backward masking or pre‐masking, and for a sound occurring after the masker, called forward masking or post‐masking. Finally, the chapter discusses the first steps of spectral analysis conducted in hearing; that is, the characteristics of the frequency bands in hearing.
Article
The investigation of technologies that can improve the sustainability of the air transport system requires not only the development of alternative fuel concepts and novel vehicle technologies but also the definition of appropriate assessment strategies. Regarding noise, the assessment should reflect the situation of communities living near airports, i.e., not only addressing sound levels but also accounting for the annoyance caused by aircraft noise. For this purpose, conventional A-weighted sound pressure level metrics provide initial but limited information, as the level and frequency dependency of human hearing is accounted for in a simplified manner. Ideally, subjective evaluations are required to adequately quantify the perceived short-term annoyance associated with aircraft noise. However, listening tests are time-consuming and not suitable for the conceptual aircraft design stage, where a large solution space needs to be explored. Aiming at bridging this gap, this work presents a methodology for the sound quality assessment of computational aircraft noise predictions, conducted in terms of objective psychoacoustic metrics. The proposed methodology is applied to a novel medium-range vehicle with fan noise shielding architecture during takeoff and landing procedures. The relevance of individual sound sources, i.e., airframe and engine noise contributions, and their dependencies on the aircraft architecture and flight procedures are assessed in terms of loudness, sharpness, and tonality. Moreover, the methodology is steered towards community noise assessment, where the impacts on short-term annoyance brought by the novel aircraft design are analysed. The assessment is based on the modified psychoacoustic annoyance, a metric that provides a quantitative description of human annoyance as a combination of different hearing sensations. The present work is understood as an essential step towards low-annoyance aircraft design.
Article
Simplified mathematical theories are essential for determining causalities and for predicting the perception evoked by a given stimulus, which provides the evident need for experimental analysis and modelling of hearing. This chapter describes several computational auditory models and their applications. The auditory models are classified as simple psychoacoustic models, filter bank models, cochlear models, hair‐cell models, models for cognitive processing, and models of binaural interaction. The auditory models are related to a relatively low level of neural processing. It is useful and necessary to simulate the functionality of hearing at higher levels to understand the functionality of the auditory system in detail. The chapter describes a few functional models for higher‐level processing. Some of them also have a limited physiological basis, but, in general, they are hypothetical models. A plethora of binaural and monaural models of spatial hearing have been proposed, and some of them are discussed in this chapter.
Chapter
There are four central quantities or dimensions of psychoacoustics, namely pitch, loudness, timbre, and subjective duration, all of which are relatively well defined and orthogonal to each other, except perhaps timbre. This chapter describes a few of these quantities which are useful in the research on psychoacoustics or in technical applications. These quantities are sharpness, roughness, fluctuation strength, tonality, consonance, and dissonance. When the modulation rate far exceeds 16 Hz, our hearing is unable to follow the level of sound, and the sound is perceived as rough, associated with the psychoacoustic quantity roughness. It has been proposed that onsets which define impulsiveness can be identified by finding the regions where the positive slope of the instantaneous sound pressure level exceeds 10 dB/s. Roughness is also perceived to be high when broadband noise is amplitude modulated. The chapter contains a discussion on two psychoacoustic views of music in two specific perspectives: melody and harmony, and rhythm.
Article
Developmental dyslexia is most commonly associated with phonological processing difficulties. However, children with dyslexia may experience poor speech-in-noise perception as well. Although there is an ongoing debate whether a speech perception deficit is inherent to dyslexia or acts as an aggravating risk factor compromising learning to read indirectly, improving speech perception might boost reading-related skills and reading acquisition. In the current study, we evaluated advanced speech technology as applied in auditory prostheses, to promote and eventually normalize speech perception of school-aged children with dyslexia, i.e., envelope enhancement (EE). The EE strategy automatically detects and emphasizes onset cues and consequently reinforces the temporal structure of the speech envelope. Our results confirmed speech-in-noise perception difficulties by children with dyslexia. However, we found that exaggerating temporal “landmarks” of the speech envelope (i.e., amplitude rise time and modulations)—by using EE—passively and instantaneously improved speech perception in noise for children with dyslexia. Moreover, the benefit derived from EE was large enough to completely bridge the initial gap between children with dyslexia and their typical reading peers. Taken together, the beneficial outcome of EE suggests an important contribution of the temporal structure of the envelope to speech perception in noise difficulties in dyslexia, providing an interesting foundation for future intervention studies based on auditory and speech rhythm training.
Article
This study aims to evaluate the effects of architectural layouts of emergency departments on activity patterns/work routines and consequent noise levels. Three Danish hospitals’ emergency departments that had different layouts were investigated via on-site noise measurements over three days and observations of noisy activities. The time-averaged noise levels in the three emergency departments turned out to be significantly different, ranging from 50 to 59 dBA. During the observation of noisy activities, the noise levels and occurrences of individual activities were noted and correlated with the long-term measurements. Major noise sources that have high correlations with the three-day noise levels are identified as loud staff communication and noise from alarms/electronic devices. Potential remedies are suggested in connection to the emergency departments’ architectural layouts. Especially unnecessarily loud communication between medical staff and loud and frequent equipment/patient transportation noise need to be improved for more comfortable acoustic environments.
Article
This study aims to examine the influence of colour exposure on noise annoyance. Previous studies in the literature have focused mostly on the effects of colour exposure on loudness judgements; however, due to the cognitive nature of multisensory perception, the influence of colour on noise annoyance also needs to be investigated. Our experiments were designed to administer non-information-carrying sound signals (i.e. white noise) and visual stimuli (i.e. abstract colour samples) and to limit visual and auditory contextual information. Participants were asked to evaluate noise annoyance on an 11-point International Commission on Biological Effects of Noise (ICBEN) scale. The experiments were conducted in the form of audiovisual tests. During these tests, random combinations of three white noise sound samples with sound pressure levels of 66 dB(A) (−4 dB[A] acoustic condition), 70 dB(A) (0 dB[A] acoustic condition) and 74 dB(A) (+4 dB[A] acoustic condition), and six visual stimuli, namely the elementary colours of the Natural Colour System (NCS): yellow (Y), red (R), blue (B), green (G), white (W) and black (S), were presented to a total of 42 participants. The black colour sample was used to measure the audio-only control condition for the three white noise sound samples. The results of the study reveal that the effects of sound, the effects of colour and the interaction effects of colour and sound on perceived noise annoyance were statistically significant. The effects of colour on the loudness evaluations of the previous studies and the effects of colour on noise annoyance evaluations presented in this study show very similar and concordant results, indicating that the effects of colour on noise annoyance depend on the sound pressure level (SPL). The results indicate that the hue contrasts of red-green, red-blue and yellow-blue and the lightness contrast of yellow-blue influenced perceived noise annoyance when the SPL was low or high. 
Within the contrast pairs, red and yellow were perceived to be annoying, whereas blue and green were perceived to be non-annoying.
Chapter
Psychoacoustics, as an integral part of neuroscience, is the science that studies the correlation between acoustic stimuli and perceived sensation. In neuroscience, importance is given to the higher functions that occur in the brain, namely in the cerebral cortex. Among the subjective qualities that allow us to describe musical sound, we find pitch, loudness and timbre. Loudness variations are important in musical performance, making it more exciting. Musical dynamics is prescribed in the score through dynamic markings (from pp to ff). The present chapter aims to analyse the relationship between loudness and certain physical parameters, as well as other factors, which can influence loudness perception. Loudness depends mainly on sound intensity (in dB SPL) and corresponds to the perceptual correlate of this physical parameter. However, it also depends on other variables such as frequency, spectrum/spectral bandwidth, duration, context (e.g., room acoustics), and personal factors (e.g., hearing loss). Transformation of the physical sound intensity sensation into loudness perception will only be completed in the auditory cortex.
Article
This paper presents results from a one-year study of indoor annoyance and self-reported sleep times for two participants located near different wind farms. Continuous measurements of outdoor and indoor noise and meteorological conditions were taken at each location for the duration of the study. In at least 50% of the annoyance recordings, participants described noise as “swish” or “swoosh.” Furthermore, the majority of the annoyance recordings occurred at nighttime and in the early morning. The third quartile of A-weighted indoor sound pressure level [SPL(A)], between 27 and 31 dBA, was associated with an 88% increased probability of annoyance compared to the lowest reference quartile, which was between 12 and 22 dBA [odds ratio and 95% confidence intervals, 7.72 (2.61,22.8), p < 0.001]. The outdoor SPL(A) was also predictive of annoyance but only between 40 and 45 dBA. The outdoor prevalence of amplitude modulation (AM), defined as the percentage of time that AM was detectable by an algorithm for each annoyance period, was also associated with annoyance. Self-reported sleep efficiency (time spent asleep relative to time in bed available for sleep) was significantly associated with nighttime annoyance (β = −0.66, p = 0.02) but only explained a small fraction of the variance (R² = 5%).
Article
This article presents the theoretical framework of some of the problems of 21st century cities, such as climate change, urbanisation and population ageing, in order to establish the background that guides research into the thermal and acoustic comfort of older people in public spaces in the city. To this end, international policies, urban planning concepts and all their contemporary aspects that have arisen as a result of the adaptation of cities to demographic and climatic phenomena, as well as their influence on the health of the population, are analysed. In addition, it conceptualises the phenomenon of population ageing, the global, regional and local reality of Spain in terms of this demographic process and the public policies that have been put forward as a measure of adaptation in cities. Finally, the concepts of thermal and acoustic comfort are explained, identifying the state of the art and the main variables that influence, through a summary of the systematic literature review, the perception of thermal and acoustic comfort of older adults and their differences with other age groups.
Article
Given the high power concentration of combustion engines used in military aviation, it is reasonable to measure the instantaneous surges in the sound pressure level. Therefore, a research question was raised regarding the differences in this level between individual aircraft engines of the same type (F100-PW-229), in order to assess their size and statistical significance. The aim of the paper is to check whether the parameters of the acoustic level generated by the aircraft engines differ significantly. The measurements were carried out for 32 engines of the F-16 Block 52+ multirole aircraft during the takeoff process. The parameters of noise in the point system and in the octave distribution were subject to analysis. Statistical methods dedicated to assessing production stability, i.e. the Shewhart chart, were applied. The results of the analysis showed that the discrepancies generally do not exceed ±3σ. Therefore, it can be concluded that the analogous results for F-16 noise are homogeneous. Thus, the Shewhart chart method proved useful for assessing the homogeneity of these measurements.
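The ±3σ criterion mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the chart construction, so this assumes an individuals-style chart whose limits are the baseline mean plus or minus three sample standard deviations; the function names and the level values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def shewhart_limits(baseline):
    """Return (lower limit, centre line, upper limit) from baseline data.

    Assumption: sigma is estimated with the sample standard deviation;
    other estimators (e.g. moving range) are also common in practice.
    """
    centre = mean(baseline)
    sigma = stdev(baseline)
    return centre - 3 * sigma, centre, centre + 3 * sigma


def flag_outside(values, lower, upper):
    """Return the indices of values that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical sound pressure levels in dB for eight engines (illustrative only).
baseline_db = [139.8, 140.1, 140.3, 139.9, 140.0, 140.2, 139.7, 140.4]
lower, centre, upper = shewhart_limits(baseline_db)

# Check two further hypothetical measurements against the baseline limits.
flagged = flag_outside([140.0, 142.0], lower, upper)
```

Measurements whose indices appear in `flagged` would be treated as departing from the homogeneous group; an empty list corresponds to the "within ±3σ" outcome the abstract reports.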
Article
Vocal and facial cues typically co-occur in natural settings, and multisensory processing of voice and face relies on their synchronous presentation. Psychological research has examined various facial and vocal cues to attractiveness as well as to judgements of sexual dimorphism, health, and age. However, few studies have investigated the interaction of vocal and facial cues in attractiveness judgments under naturalistic conditions using dynamic, ecologically valid stimuli. Here, we used short videos or audio tracks of females speaking full sentences and used a manipulation of voice pitch to investigate cross-modal interactions of voice pitch on facial attractiveness and related ratings. Male participants had to rate attractiveness, femininity, age, and health of synchronized audio-video recordings or voices only, with either original or modified voice pitch. We expected audio stimuli with increased voice pitch to be rated as more attractive, more feminine, healthier, and younger. If auditory judgements cross-modally influence judgements of facial attributes, we additionally expected the voice pitch manipulation to affect ratings of audiovisual stimulus material. We tested 106 male participants in a within-subject design in two sessions. Analyses revealed that voice recordings with increased voice pitch were perceived to be more feminine and younger, but not more attractive or healthier. When coupled with video recordings, increased pitch lowered perceived age of faces, but did not significantly influence perceived attractiveness, femininity, or health. Our results suggest that our manipulation of voice pitch has a measurable impact on judgements of femininity and age, but does not measurably influence vocal and facial attractiveness in naturalistic conditions.
Article
This review article examines the historical development of equal temperament, known as the tuning system of the piano and fixed-fretted instruments. Temperament is briefly defined as altering musical intervals by certain amounts from their counterparts in the natural (Just Intonation) scale. By the end of the 1300s, most keyboard instruments are known to have been tempered. The first method described in writing is the temperament called “mean-tone”. In this method, importance was given to keeping the thirds pure or close to pure, while the fifths were slightly lowered. In mean-tone temperament, the presence of some excessively high or low intervals and of unusable scales led to the emergence of irregular temperaments. Although irregular temperaments contain more than one interval with the same name, this method was applied to keyboard instruments for about two centuries, since all scales can be used. Even while mean-tone and irregular temperaments were in use, theorists consistently advocated the importance and necessity of equal temperament. Though equal temperament began to be practiced on fretted instruments in the 1570s, it was accepted much later on keyboard instruments. The cent system, introduced by Alexander J. Ellis in 1876, was a turning point in the calculation of equal temperament. With the spread of 12-tone music in international art music in the 20th century, unequal temperaments came to an end, and equal temperament was accepted as the common tuning system for keyboard and fretted instruments.
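As a small illustration of the cent measure mentioned in this abstract, the sketch below compares pure (just) intervals with their 12-tone equal-tempered counterparts. It uses the standard definition of 1200 cents per octave; the function names are hypothetical and the comparison is not taken from the article itself.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def equal_tempered_ratio(semitones):
    """Frequency ratio spanning a number of 12-tone equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


# Perfect fifth: pure ratio 3/2 versus 7 equal-tempered semitones.
just_fifth = cents(3 / 2)
et_fifth = cents(equal_tempered_ratio(7))

# Major third: pure ratio 5/4 versus 4 equal-tempered semitones.
just_third = cents(5 / 4)
et_third = cents(equal_tempered_ratio(4))

print(f"fifth: just {just_fifth:.2f} cents, equal-tempered {et_fifth:.2f} cents")
print(f"third: just {just_third:.2f} cents, equal-tempered {et_third:.2f} cents")
```

The equal-tempered fifth is about 2 cents narrower than the pure fifth, while the equal-tempered major third is about 14 cents wider than the pure third, which is the kind of compromise between thirds and fifths that the temperaments described above trade off.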
Article
Hearing is one of the human’s foremost sensors; being able to hear again after suffering from a hearing loss is a great achievement, under all circumstances. However, in the long run, users of present-day hearing aids and cochlear implants are generally only halfway satisfied with what the commercial side offers. We demonstrate here that this is due to the failure of a full integration of these devices into the human physiological circuitry. Important parts of the hearing network that remain unestablished are the efferent connections to the cochlea, which strongly affects the faculty of listening. The latter provides the base for coping with the so-called cocktail party problem, or for a full enjoyment of multi-instrumental musical plays. While nature clearly points at how this could be remedied, to achieve this technologically will require the use of advanced high-precision electrodes and high-precision surgery, as we outline here. Corresponding efforts must be pushed forward by coordinated efforts from the side of science, as the commercial players in the field of hearing aids cannot be expected to have a substantial interest in advancements into this direction.
Article
The indoor acoustic environment has become a critical factor in architectural design, and some researchers have argued that the reactions of people of different age, gender, etc. to indoor noise should be considered. While office staff along metro lines get used to frequent metro noise, their perceptions of metro noise, which are expected to differ from those of non-office staff, need to be clearly examined. Based on on-site physical measurements and questionnaire surveys, this study analyzes the multi-dimensional perceptions (annoyance, dissatisfaction and unpleasantness) of office staff and non-office staff about metro noise in the underground commercial spaces of a high-rise building. The results indicate that, due to lower adaptability and tolerance to metro noise, the non-office staff were more sensitive to changes in metro noise than the office staff, and that the non-office staff expressed clearly more intense multi-dimensional negative moods under the same metro noise environments. Furthermore, for the non-office staff, annoyance and dissatisfaction ratings due to metro noise correlated well with the A-weighted equivalent sound pressure level (LAeq) and the maximum A-weighted sound pressure level (LAFmax). Among the psychoacoustic measures, loudness and sharpness mainly influenced their annoyance and dissatisfaction perceptions.
Article
Hearing loss in old age, which often goes untreated, has far-reaching consequences. Furthermore, reduction of cognitive abilities and dementia can also occur, which also affects quality of life. The aim of this study was to investigate the hearing performance of seniors without hearing complaints with respect to speech perception in noise and the ability to localize sounds. Results were tested for correlations with age and cognitive performance. The study included 40 subjects aged between 60 and 90 years (mean age: 69.3 years) with no self-reported hearing problems. The subjects were screened for dementia. Audiological tests included pure-tone audiometry and speech perception in two types of background noise (continuous and amplitude-modulated noise) which was either co-located or spatially separated (multi-source noise field, MSNF) from the target speech. Sound localization ability was assessed and hearing performance was self-evaluated by a questionnaire. Speech perception in noise and sound localization were compared with those of young normal-hearing adults. Although they considered themselves to have normal hearing, 17 subjects had at least a mild hearing loss. There was a significant negative correlation between hearing loss and dementia screening (DemTect) score. Speech perception in noise decreased significantly with age. There were significant negative correlations between speech perception in noise and DemTect score for both spatial configurations. Mean SRTs obtained in the co-located noise condition with amplitude-modulated noise were on average 3.1 dB better than with continuous noise. This gap-listening effect was severely diminished compared to a younger normal-hearing subject group. In continuous noise, spatial separation of speech and noise led to better SRTs compared to the co-located masker condition. SRTs in MSNF deteriorated in modulated noise compared to continuous noise by 2.6 dB. The highest impact of age was found for speech perception scores using noise stimuli with temporal modulation in binaural test conditions. Mean localization error was in the range of young adults. The mean rate of front/back confusions was 11.5% higher than for young adults. Speech perception tests in the presence of temporally modulated noise can serve as a screening method for early detection of hearing disorders in older adults. This allows for early prescription of hearing aids.
Article
This research aims to study the independent and interaction effects of aural and visual indicators on soundscape descriptors in the residential area. A virtual reality (VR) experiment was conducted to reproduce audio-visual stimuli and to collect subjective data using the questionnaire. Typical visual elements and aural environment in residential areas were reproduced using Unity software. Ten audio-only stimuli and 42 audio-visual stimuli (combination of 14 visual stimuli and three aural stimuli) were evaluated for various soundscape descriptors by 32 normal-hearing participants. There are four research findings. Firstly, a visual distraction was found on the effect of crowd sound on pleasantness. Secondly, the independent effects of indicators were analyzed. The sound of heavy traffic negatively affects pleasantness, while no significant difference has been found between no traffic and light traffic conditions. The crowd sound and PgVI (Playground View Index) positively affect eventfulness, while roughness and GVI (Green View Index) have a negative effect on eventfulness. Thirdly, traffic sound was found as a moderator of other indicators’ effect on pleasantness and eventfulness. For instance, GVI (Green View Index) can positively affect pleasantness in no traffic and light traffic conditions, while GVI does not affect pleasantness in heavy traffic conditions. Lastly, the present study did not find calmness correlated with eventfulness in residential areas, contrary to the previous studies. A possible explanation for this result might be that the correlation between descriptors is context-dependent.
Chapter
The study deals with the actuation sounds of mechanical control elements (stimulus) and the hearing sensations elicited by these stimuli. The singular impulsive sound signals are described with the help of technical and psychoacoustic magnitudes. In addition, 133 participants assessed the auditory perceived quality of the very same control elements in four different passenger cars. The relation between stimuli and user judgement clarifies which parameters have an influence on high perceived quality. Based on the participants’ preferences, favorable value ranges for actuation acoustics can be reported. When developing control elements as a central component of the human-machine interface, OEMs are increasingly striving to ensure customer-relevant value impression.
Article
The occurrence of vocal dimorphism in strigid owls has been known for a long time and it is widely accepted that males emit songs with lower frequency than those of females. However, there are several gaps in our knowledge about this phenomenon, which may be related to a scarcity of sound recordings of sexed individuals and difficulties in obtaining this kind of material. In this paper, we present a novel permutation-based analysis that allows the study of vocal dimorphism using recordings of unsexed individuals presumably forming breeding pairs. We used this method to investigate vocal dimorphism in the Tawny-browed Owl (Pulsatrix koeniswaldiana) alongside more conventional t tests, applied only to a subsample of individuals of known sex. Our results indicate that males of P. koeniswaldiana emit songs with lower frequencies and a narrower frequency range than those of females. There was also some evidence, which emerged only from the analysis of sexed individuals, that males emit longer songs with fewer harmonics. Even though our novel method does not estimate dimorphism directly and does not show its direction, it allows investigating it in a relatively easy and economical way and seems promising to help increase our knowledge about vocal dimorphism in strigids and other taxa.
Article
Purpose: Vocal roughness is present in many voice disorders, but its assessment depends mainly on subjective auditory-perceptual evaluation and lacks acoustic correlates. This study aimed to apply the concept of roughness in general sound quality perception to vocal roughness assessment and to characterize the relationship between vocal roughness and temporal envelope fluctuation measures obtained from an auditory model. Method: Ten /ɑ/ recordings with a wide range of roughness were selected from an existing database. Ten listeners rated the roughness of the recordings in a single-variable matching task. Temporal envelope fluctuations of the recordings were analyzed with an auditory processing model of amplitude modulation that uses a modulation filterbank with different modulation frequencies. Pitch strength and the smoothed cepstral peak prominence (CPPS) were also obtained for comparison. Results: Individual simple regression models yielded envelope standard deviation from a modulation filter with a low center frequency (64.3 Hz) as a statistically significant predictor of vocal roughness with a strong coefficient of determination (r² = .80). Pitch strength and CPPS were not significant predictors of roughness. Conclusion: This result supports the possible utility of envelope fluctuation measures from an auditory model as objective correlates of vocal roughness.
Thesis
The goal of technical evolution in entertainment electronics is to improve the user experience by providing visuals and acoustics in the best possible way. With modern virtual and augmented reality devices and applications, the goal of a reproduction indistinguishable from reality has become more tangible. When the listener is no longer able to distinguish artificial sound sources from real ones, the term auditory illusion is used. To achieve such illusions, different technical challenges need to be mastered. But the assumption that an exact replica of the ear signals leads to the same perceptions as in the corresponding real-life situation is not correct. Fundamental mechanisms of human perception, such as the integration of cues from different modalities and the dependency on expectations and experience, add another layer of complexity. These expectations can change depending on prior sound exposure. In the context of spatial hearing, this means that listeners are probably able to learn how to interpret spatial cues. Such mechanisms and their effect on the perceived quality of spatial sound reproduction systems are the scope of this work. Perceptual studies investigate the learning of spatial localization cues and adaptation mechanisms related to room acoustic perception. Quality deficits due to mismatched ear signals are measured, and it is shown how quality ratings can change depending on training. The results suggest that learning and adaptation processes are a key factor in establishing an auditory illusion. The practical relevance of such effects and their underlying principles are discussed.
Chapter
In 2011, the Coffee Cultural Landscape of Colombia (CCLC) was inscribed in the UNESCO World Heritage List. Several studies have been undertaken to increase its knowledge and promote its conservation and sustainable development; however, there still exists a gap between the knowledge of the visible features of this landscape and the audible ones, which are associated to anthropophonic, geophonic, and, mainly, to biophonic sound-emitting sources. The perception or recording of the audible features in a place has been recently termed as soundscape and is studied by a relatively novel discipline known as ecoacoustics. This chapter is, therefore, aimed to discuss the potential opportunities and challenges of applying ecoacoustic methods—particularly non-negative matrix factorization and acoustic indices—to enrich the study of the CCLC. Essential concepts for both the CCLC and ecoacoustics are also briefly explained, along with an outline of future work directions in short- and long-term perspectives.
Article
Selective listening to different sound sources in complex acoustic scenes is still an urgent challenge for machine hearing. In this study, the separation of two sound sources from monaural mixtures is investigated. An end-to-end fully convolutional time-domain audio separation network (ConvTasNet) is trained on a universal dataset, which includes speech, environmental sounds, and music. With the best-performing network, mixtures in the test dataset achieve an average scale-invariant signal-to-distortion ratio improvement (SI-SDRi) of 11.70 dB, which is comparable with human performance in separating natural sources. Beyond the promising performance, the main contribution of our study is to reveal the underlying separation mechanisms of the network through a series of classical human auditory segregation experiments. Results show that, without any biological modeling of the auditory system, the proposed network spontaneously mimics aspects of the auditory system to separate sources. Not only are the frequency proximity and harmonicity principles of auditory scene analysis spontaneously learned by such a purely statistical deep network, but the frequency selectivity at high and low frequencies and the resolvability of harmonics are also precisely simulated. The emergence of deep networks whose behavior resembles that of human listeners makes it possible to develop a universal network that can be adapted to all scenes and achieve selective listening like the human ear. It also provides a new perspective on the modeling of the auditory system for other problems such as recognition and localization.
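For readers unfamiliar with the SI-SDRi figure quoted in this abstract, the following is a minimal sketch of how a scale-invariant signal-to-distortion ratio and its improvement over the unprocessed mixture are commonly computed. It is not the authors' evaluation code; the function names `si_sdr` and `si_sdr_improvement` are illustrative, and the signals are assumed to be equal-length, zero-mean sequences of samples.

```python
import math


def si_sdr(estimate, target):
    """Scale-invariant SDR in dB (sketch; assumes zero-mean, equal-length signals)."""
    # Optimal scaling of the target that best explains the estimate.
    dot_et = sum(e * t for e, t in zip(estimate, target))
    dot_tt = sum(t * t for t in target)
    alpha = dot_et / dot_tt

    # Split the estimate into a scaled-target part and a residual part.
    scaled_target = [alpha * t for t in target]
    residual = [e - s for e, s in zip(estimate, scaled_target)]

    target_energy = sum(s * s for s in scaled_target)
    residual_energy = sum(r * r for r in residual)

    if residual_energy == 0.0:
        return math.inf   # perfect reconstruction up to a scale factor
    if target_energy == 0.0:
        return -math.inf  # estimate contains no target component
    return 10.0 * math.log10(target_energy / residual_energy)


def si_sdr_improvement(estimate, mixture, target):
    """SI-SDRi: gain of the separated estimate over the unprocessed mixture."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)
```

Because the target is rescaled before the residual is measured, multiplying the estimate by any nonzero constant leaves the score unchanged, which is the property the "scale-invariant" in the name refers to.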
Article
Loudness judgments of sounds varying in level across time show a non-uniform temporal weighting, with increased weights assigned to the beginning of the sound (primacy effect). In addition, higher weights are observed for temporal components that are higher in level than the remaining components (loudness dominance). In three experiments, sounds consisting of 100- or 475-ms Gaussian wideband noise segments with random level variations were presented and either none, the first, or a central temporal segment was amplified or attenuated. In Experiment 1, the sounds consisted of four 100-ms segments that were separated by 500-ms gaps. Previous experiments did not show a primacy effect in such a condition. In Experiment 2, four- or ten-100-ms-segment sounds without gaps between the segments were presented to examine the interaction between the primacy effect and level dominance. As expected, for the sounds with segments separated by gaps, no primacy effect was observed, but weights on amplified segments were increased and weights on attenuated segments were decreased. For the sounds with contiguous segments, a primacy effect as well as effects of relative level (similar to those in Experiment 1) were found. For attenuation, the data indicated no substantial interaction between the primacy effect and loudness dominance, whereas for amplification an interaction was present. In Experiment 3, sounds consisting of either four contiguous 100-ms or 475-ms segments, or four 100-ms segments separated by 500-ms gaps were presented. Effects of relative level were more pronounced for the contiguous sounds. Across all three experiments, the effects of relative level were more pronounced for attenuation. In addition, the effects of relative level showed a dependence on the position of the change in level, with opposite direction for attenuation compared to amplification. Some of the results are in accordance with explanations based on masking effects on auditory intensity resolution.
Article
Children with dyslexia have difficulties learning how to read and write. They are often diagnosed only after they fail at school, even though dyslexia is not related to general intelligence. Early screening of dyslexia can prevent the negative side effects of late detection and enables early intervention. In this context, we present an approach for universal screening of dyslexia using machine learning models with data gathered from a web-based language-independent game. We designed the game content taking into consideration the analysis of mistakes of people with dyslexia in different languages and other parameters related to dyslexia, such as auditory and visual perception. We conducted a user study with 313 children (116 with dyslexia) and trained predictive machine learning models with the collected data. Our method yields an accuracy of 0.74 for German and 0.69 for Spanish as well as an F1-score of 0.75 for German and 0.75 for Spanish, using Random Forests and Extra Trees, respectively. We also present the collected user data, game content design, potential new auditory input, and knowledge about the design approach for future research to explore universal screening of dyslexia. Universal screening with language-independent content can be used for the screening of pre-readers who do not have any language skills, facilitating a potential early intervention.
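The abstract reports both accuracy and F1-score, which can differ noticeably on the same classifier. As a hedged illustration of the two definitions (not the study's evaluation code), the sketch below computes both from the four cells of a binary confusion matrix. The function name `accuracy_and_f1` and all counts used with it are hypothetical and are not taken from the study.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives (hypothetical inputs).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # Precision: share of positive predictions that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that are detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        return accuracy, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return accuracy, f1
```

With hypothetical counts such as `accuracy_and_f1(6, 3, 1, 2)`, F1 comes out higher than accuracy because F1 ignores true negatives; this is one way an F1-score can exceed the accuracy reported for the same model.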
Article
For car buyers, the acoustic impression of the quality of a vehicle drive train is becoming more and more relevant. The perceived sound quality of the engine unit plays a key role here. That sound quality is negatively influenced by individual background noises. These noise components, which are perceived as unpleasant, need to be further reduced in the course of vehicle development; identifying and evaluating the disruptive noise components in the overall engine noise is a prerequisite for effective acoustic optimization. The pulsed ticking noise is classified as particularly annoying in Otto DI engines, which is why this article aims to analyze and evaluate the ticking noise components within the overall noise. For this purpose, an empirical formula was developed which can classify the ticking noise components in terms of their intensity. The formula is purely perception-based and combines the impulsiveness, the loudness and the sharpness of the overall engine noise. As with other psychoacoustic evaluation scales, the rating was made from 1 (very ticking) to 10 (not ticking). The ticking noise evaluation formula was then verified on the basis of hearing tests with the help of a jury of experts. With it, the engine map areas in which the ticking noise undermines the pleasantness of the overall engine noise can be predicted precisely.
Article
In this article, we report on research and creative practice that explores the aesthetic interplay between movement and sound for soft robotics. Our inquiry seeks to interrogate what sound designs might be aesthetically engaging and appropriate for soft robotic movement in a social human-robot interaction setting. We present the design of a soft sound-producing robot, SONŌ, made of pliable and expandable silicone and three sound designs made for this robot. The article comprises an articulation of the underlying design process and results from two empirical interaction experiments (N = 66, N = 60) conducted to evaluate the sound designs. The sound designs did not have statistically significant effects on people’s perception of the social attributes of two different soft robots. Qualitative results, however, indicate that people’s interpretations of the sound designs depend on robot type.
Article
Understanding comprehensive sound perceptions is a major challenge in translating the affective needs and responses of drivers and passengers into the human‐centered design of seat‐belt warning sounds (SBWS). Because SBWS create various sound impressions, sound quality has multiple impacts on driving safety, pleasure, stress, and so forth. This paper aims to derive a specified psychological factor structure for the sound perceptions of SBWS and to rebuild the connections between the psychological and acoustic attributes. Kansei (emotional or affective) evaluation is employed to collect the affective responses to 20 sound stimuli from 10 experts and 134 other participants. The perceived sound quality is expressed by 11 psychological attributes: “cheerful,” “interesting,” “lovely,” “comfortable,” “nervous,” “cordial,” “advanced,” “crisp,” “nostalgic,” “aroused,” and “weak.” Russell's emotional state model is also employed to map the sounds into a Pleasure–Arousal Kansei space for further analysis. For acoustic attributes, we measure amplitude, frequency, pause ratio, repetition speed, and element number, which are easy to tune according to the following design guidelines: first, a decrease in sound frequency results in a perception of sleepiness (less perceived urgency); second, a smaller number of elements (peaks) per cycle results in a perception of displeasure (less satisfaction); third, a lower repetition speed results in perceptions of pleasure and sleepiness simultaneously. The multiple linear regression model further indicates the quantitative psychoacoustic relationships, which helps sound designers to identify the key acoustic attributes and predict the affective responses for the human‐centered design of SBWS.
Article
This work reviews the literature of 46 peer-reviewed papers and presents the current status on the use of psychoacoustic indicators in soundscape studies. The selection of papers for a systematic review followed the PRISMA method. Afterwards, descriptive analysis and principal component analysis (PCA) were realised. For the PCA, the following parameters extracted from the papers were analysed: psychoacoustic indicator, hypothesis, statistical units, data collection method and major findings for each investigated psychoacoustic indicator. The results show an overview of the use of psychoacoustic indicators, through main hypothesis and findings for each psychoacoustic indicator i.e. the importance of statistical units, such as percentiles, to investigate the hypothesis related to the description of auditory descriptors and perceptual attributes. Another important finding is that many papers lack the specification of computation methods limiting the comparability of study results and impeding the meta-analyses.
Article
Several rotors of axial-flow fans and side channel blowers with optimal circumferential spacing have been tested in a hemi-anechoic chamber; the rotors were designed by means of an existing method based on minimizing the prominence of tonal noise peaks above the broadband spectrum. The resulting SPL spectra contain different components: tonal ones, which are a well-known cause of annoyance (intended as an undesired feature of the received noise related to short-term exposure), broadband ones, which could have a positive masking effect, and further components due to the electric motor. As expected, the blade spacing strongly affects the tonal noise part of the spectrum, which is mainly of aeroacoustic origin; in axial-flow fans, it also affects the low-frequency broadband part related to the leakage flow. The present study has also confirmed the good consistency between measured and theoretically predicted spectra, even for side channel blowers. The characteristic curves of the tested machines are not significantly affected, but the quality of the radiated noise changes considerably as the spacing non-uniformity increases: tonalness decreases and roughness increases, fluctuation shows no systematic behavior, and no significant variation in other perceived characteristics is noticeable. To quantify this perception, the psychoacoustic parameters commonly employed for automotive cooling fans have been computed. The tonalness decrease leads to a significant annoyance reduction, while the increase in roughness seems to be of minor importance from a noise annoyance standpoint. Nevertheless, in strongly asymmetric rotors, the change in the perceived acoustic signature may be relevant, and an unaware listener could interpret the roughness increase as a potential mechanical malfunction.
As for the generality of the present discussion, it should be considered that the tested rotors have been optimized to reduce the prominence of tonal noise peaks, which obviously results in a reduction in the perceived tonalness. In contrast, spacing obtained by means of different criteria could yield different variations in tonalness, and the same likely applies to loudness. The rise of peaks at shaft frequency harmonics, however, is a direct consequence of the uneven spacing and should systematically yield a roughness increase.