Adam Weisser’s research while affiliated with Macquarie University and other places


Publications (9)


Figure: Experimental setup. On the left, pairs stood and were free to move around while holding an unrestricted conversation; on the right, pairs were seated. Acoustically transparent headphones delivered identical 3D simulated real-world acoustic environments to each person, attached motion sensors tracked head movement, and calibrated close-talk (boom) microphones recorded the speech signals. Each person wore a pouch housing the wireless headphone receiver and microphone transmitter.
Figure: Adaptive behavior of interpersonal distance (A and B) and speech level (D and E) as a function of time, with background noise and talker configuration as parameters. (C) and (F) show the mean change (error bars: standard errors) in interpersonal distance and speech level, i.e., the difference between the final and starting states.
Figure: (A, B, C) Mean and standard error of the three main interpersonal motor coordination measures of conversational partners as a function of background noise level and talker configuration, with dashed regression lines showing the statistical modeling: (A) similarity of movement (%REC), (B) structural organization (%DET), and (C) stability of coordination (MAXLINE); right panels show the temporal fluctuations across background noise levels and talker configurations where appropriate. (D) Boxplot of the amplitude fluctuation of the background noise; (E) boxplot of the amplitude fluctuation of the median speech levels; (F) Pearson's product-moment correlations between the background noise envelope (i.e., level fluctuations) and the time-aligned median speech envelope, with error bars denoting 95% confidence intervals and asterisks marking significant p-values.
Figure: Scatterplots of the relationship between short-term background noise level and speech level.
Figure: Total number of communication breakdowns in the seated and standing configurations across background noise levels.
Behavioral dynamics of conversation, (mis)communication and coordination in noisy environments
  • Article
  • Full-text available

November 2023 · 70 Reads · 3 Citations

Adam Weisser · [...] · Joerg M. Buchholz

During conversations, people coordinate simultaneous channels of verbal and nonverbal information to hear and be heard. But background noise at levels such as those found in cafes and restaurants can be a barrier to conversational success. Here, we used speech and motion tracking to reveal the reciprocal processes people use to communicate in noisy environments. Conversations between twenty-two pairs of typical-hearing adults were elicited under different conditions of background noise, while standing or sitting around a table. With the onset of background noise, pairs rapidly adjusted their interpersonal distance and speech level, with the degree of initial change dependent on noise level and talker configuration. Following this transient phase, pairs settled into a sustaining phase in which reciprocal speech and movement-based coordination processes synergistically maintained effective communication, again with the magnitude and stability of these coordination processes covarying with noise level and talker configuration. Finally, as communication breakdowns increased at high noise levels, pairs exhibited resetting behaviors to help restore communication, decreasing interpersonal distance and/or increasing speech levels in response to communication breakdowns. Approximately 78 dB SPL defined a threshold beyond which behavioral processes were no longer sufficient for maintaining effective conversation and communication breakdowns rapidly increased.
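The coordination measures named in the figures above (%REC, %DET, MAXLINE) come from cross-recurrence quantification analysis. Below is a minimal sketch of how such measures can be computed from two movement time series; the radius, toy signals, and simple 1-D (non-embedded) formulation are illustrative assumptions, not the study's analysis pipeline.

```python
import numpy as np

def cross_recurrence(x, y, radius):
    """Binary cross-recurrence matrix: 1 where |x_i - y_j| <= radius."""
    return (np.abs(x[:, None] - y[None, :]) <= radius).astype(int)

def crqa_measures(rec, min_line=2):
    """Return (%REC, %DET, MAXLINE) for a binary recurrence matrix."""
    n = rec.shape[0]
    total = rec.sum()
    rec_rate = 100.0 * total / rec.size            # %REC: share of recurrent points

    # Collect lengths of diagonal runs of consecutive recurrent points.
    runs = []
    for k in range(-(n - 1), n):
        length = 0
        for v in np.diagonal(rec, offset=k):
            if v:
                length += 1
            elif length:
                runs.append(length)
                length = 0
        if length:
            runs.append(length)
    lines = np.array([r for r in runs if r >= min_line])

    det = 100.0 * lines.sum() / total if total else 0.0   # %DET: points on lines
    maxline = int(lines.max()) if lines.size else 0       # MAXLINE: longest line
    return rec_rate, det, maxline

# Toy head-movement signals: two noisy oscillations with a phase offset.
rng = np.random.default_rng(0)
t = np.linspace(0, 60, 600)                               # 60 s at 10 Hz
a = np.sin(2 * np.pi * 0.25 * t) + 0.3 * rng.standard_normal(t.size)
b = np.sin(2 * np.pi * 0.25 * t + 0.5) + 0.3 * rng.standard_normal(t.size)
print(crqa_measures(cross_recurrence(a, b, radius=0.5)))
```

In practice, recurrence analyses of this kind are usually run on phase-space embedded signals, with the radius and embedding parameters tuned per dataset.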


Cocktail-party behavior: dynamics of conversation and (mis)communication in noisy environments

February 2023 · 13 Reads

During conversations, people effortlessly coordinate simultaneous channels of verbal and nonverbal information to hear and be heard. But the presence of common background noise levels, such as those found in cafes and restaurants, can be a barrier to conversational success. Here, we used speech and motion tracking to reveal the behavioral processes that talkers use to communicate effectively during conversations in noisy environments. Natural speech communication of twenty-two pairs of normal-hearing adults was elicited under different conditions of realistic background noise, while standing freely or sitting around a table. The results revealed that the behavior of conversing partners entails three phases of adaptation. First, with the onset of background noise, pairs rapidly adjusted their interpersonal distance and speech level, with the degree of initial change dependent on noise level and talker configuration. Following this transient phase of behavioral adaptation, pairs settled into a steady-state or sustaining phase of behavioral coordination, in which reciprocal speech and movement-based coordination processes operated to synergistically maintain effective communication, again with the magnitude and stability of these coordination processes covarying with noise level and talker configuration. Finally, as communication breakdowns started to increase at high levels of background noise, pairs exhibited intermittent resetting behaviors to help restore communication: individuals further decreased interpersonal distance and/or increased speech levels in direct response to communication breakdowns. The findings show that approximately 78 dB SPL defines a critical noise threshold above which behavioral coordination processes are no longer sufficient for maintaining effective conversation and communication breakdowns rapidly increase.


Figure 3: Similar to Fig. 1, except that here the receiver-related speech levels were derived from Fig. 1 by taking into account the talker distance data from Fig. 2 (see the text). The dashed lines are identical in all panels and refer to a second-order polynomial fit to the average data across all conditions and groups.
Conversational distance adaptation in noise and its effect on signal-to-noise ratio in realistic listening environments

April 2021 · 318 Reads · 7 Citations

The Journal of the Acoustical Society of America

Everyday environments impose acoustical conditions on speech communication that require interlocutors to adapt their behavior to hear and to be heard. Past research has focused mainly on the adaptation of speech level, while few studies have investigated how interlocutors adapt their conversational distance as a function of noise level, and no study has tested the interaction between distance and speech level adaptation in noise. In the present study, participant pairs held natural conversations while binaurally listening to identical noise recordings of different realistic environments (range of 53–92 dB sound pressure level) over acoustically transparent headphones. Conversations were held in standing or sitting (at a table) conditions. Interlocutor distances were tracked using wireless motion-capture equipment, which allowed subjects to move closer to or farther from each other. The results show that talkers adapt their voices mainly according to the noise conditions and much less according to distance. Distance adaptation was highest in the standing condition. Consequently, mainly in the loudest environments, listeners in the standing condition were able to improve the signal-to-noise ratio (SNR) at the receiver location relative to the sitting condition, making it less negative. Analytical approximations are provided for the conversational distance as well as the receiver-related speech level and SNR.
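As a rough illustration of the receiver-related quantities the abstract mentions, the sketch below combines a generic Lombard slope with spherical spreading to estimate the SNR at the listener. The slope, onset, and reference speech level are common literature assumptions, not the paper's fitted approximations.

```python
import numpy as np

def receiver_snr(noise_db_spl, distance_m,
                 speech_at_1m_quiet=60.0, lombard_slope=0.5, lombard_onset=45.0):
    """SNR at the listener for a talker at distance_m in diffuse noise.

    Speech level rises with noise above lombard_onset (Lombard effect) and
    decays with distance by spherical spreading (-20*log10(d) re 1 m).
    All parameter values are illustrative assumptions.
    """
    speech_1m = speech_at_1m_quiet + lombard_slope * max(noise_db_spl - lombard_onset, 0.0)
    speech_at_receiver = speech_1m - 20.0 * np.log10(distance_m)
    return speech_at_receiver - noise_db_spl

for noise in (53, 70, 92):   # spans the study's 53-92 dB SPL range
    print(noise, round(receiver_snr(noise, 1.0), 1), round(receiver_snr(noise, 0.5), 1))
```

Under these assumptions the SNR turns negative only in the loudest environments, and halving the distance recovers about 6 dB, consistent with the pattern the abstract describes.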


Complex Acoustic Environments: Review, Framework, and Subjective Model

December 2019 · 118 Reads · 16 Citations

The concept of complex acoustic environments has appeared in several unrelated research areas within acoustics in different variations. Based on a review of the usage and evolution of this concept in the literature, a relevant framework was developed, which includes nine broad characteristics that are thought to drive the complexity of acoustic scenes. The framework was then used to study the most relevant characteristics for stimuli of realistic, everyday acoustic scenes: multiple sources, source diversity, reverberation, and the listener's task. The effect of these characteristics on perceived scene complexity was then evaluated in an exploratory study that reproduced the same stimuli with a three-dimensional loudspeaker array inside an anechoic chamber. Sixty-five subjects listened to the scenes and for each one had to rate 29 attributes, including complexity, both with and without target speech in the scenes. The data were analyzed using three-way principal component analysis with a (2 × 3 × 2) Tucker3 model in the dimensions of scales (or ratings), scenes, and subjects, explaining 42% of variation in the data. "Comfort" and "variability" were the dominant scale components, which span the perceived complexity. Interaction effects were observed; for example, the additional task of attending to target speech shifted the complexity rating closer to the comfort scale. Also, speech contained in the background scenes introduced a second subject component, which suggests that some subjects are more distracted than others by background speech when listening to target speech. The results are interpreted in light of the proposed framework.
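For readers unfamiliar with three-way PCA, the sketch below shows a (2 × 3 × 2) Tucker3 decomposition of a ratings tensor using the tensorly library. The tensor here is a random stand-in with hypothetical dimensions (29 scales × 12 scenes × 65 subjects), not the study's data.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

rng = np.random.default_rng(1)
# Stand-in ratings tensor: scales x scenes x subjects (dimensions hypothetical).
ratings = tl.tensor(rng.standard_normal((29, 12, 65)))

# Tucker3 with 2 scale components, 3 scene components, 2 subject components.
core, factors = tucker(ratings, rank=[2, 3, 2])
scales, scenes, subjects = factors

# Share of variation captured by the low-rank reconstruction.
approx = tl.tucker_to_tensor((core, factors))
explained = 1 - tl.norm(ratings - approx) ** 2 / tl.norm(ratings) ** 2
print(core.shape, f"explained variation: {explained:.0%}")
```

The factor matrices play the role of the scale, scene, and subject components discussed in the abstract, with the core tensor encoding their interactions.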


The Ambisonic Recordings of Typical Environments (ARTE) Database

July 2019 · 162 Reads · 40 Citations

Acta Acustica united with Acustica

Everyday listening environments are characterized by far more complex spatial, spectral, and temporal sound field distributions than the acoustic stimuli typically employed in controlled laboratory settings. As such, the reproduction of acoustic listening environments has become important for several research avenues related to sound perception, such as hearing loss rehabilitation, soundscapes, speech communication, auditory scene analysis, automatic scene classification, and room acoustics. However, the recordings of acoustic environments that are used as test material in these research areas are usually designed specifically for one study, or are provided in custom databases that cannot be adapted beyond their original application. In this work we present the Ambisonic Recordings of Typical Environments (ARTE) database, which addresses several research needs simultaneously: realistic audio recordings that can be reproduced in 3D, 2D, or binaurally, with known acoustic properties, including absolute level and room impulse response. Multichannel higher-order ambisonic recordings of 13 realistic typical environments (e.g., office, café, dinner party, train station) were processed, acoustically analyzed, and subjectively evaluated to determine their perceived identity. The recordings are delivered in a generic format that may be reproduced with different hardware setups, and may also be used in binaural or single-channel setups. Room impulse responses, as well as detailed acoustic analyses, of all environments supplement the recordings. The database is made open to the research community with the explicit intention to expand it in the future and include more scenes.
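One practical use of the database's absolute level metadata is reproducing a scene at its documented SPL. The sketch below scales a recording so its RMS matches a target level, given the playback chain's calibration; the file name and calibration value are hypothetical.

```python
import numpy as np
import soundfile as sf

# Hypothetical file name; assume a single calibrated channel of a scene.
audio, fs = sf.read("arte_scene.wav")
target_spl = 70.0                     # documented absolute scene level (dB SPL)
calib_spl_full_scale_sine = 100.0     # dB SPL a full-scale sine produces on this chain

# Level of the recording relative to a full-scale sine (RMS = 1/sqrt(2)).
rms = np.sqrt(np.mean(audio ** 2))
current_spl = calib_spl_full_scale_sine + 20 * np.log10(rms * np.sqrt(2))

# Linear gain that brings the recording to the documented level.
gain = 10 ** ((target_spl - current_spl) / 20)
audio_at_level = gain * audio
```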


Figure 2: Different levels of information flow of human speech. The lowest level (innermost in the figure) is the exact acoustic signal that the source transmits to the receiver (that the talker says to the listener); this signal carries the highest information rate. One level up (outward), the motor system is responsible for modulating the airflow that results in speech production, which may be mirrored in speech perception as well (Liberman et al. 1967, Poeppel et al. 2007), among other parallel nonacoustic channels (e.g., Sumby & Pollack 1954, McGurk & MacDonald 1976). The linguistic level is one level higher and already has much lower and more efficient informational content than the acoustic signal (Chomsky 1956). Finally, on an abstract psychological level, there is thought (goals, emotion, motivation, etc.) that generates / parses the messages to / from the linguistic modality. All information transfer that occurs within the brain is coded in intermediate neural signals, which may also include feedback and feedforward loops. Noise sources for the channels are not shown.
Auditory information loss in real-world listening environments

February 2019 · 183 Reads

Whether in animal or speech communication, environmental sounds, or music, all sounds carry some information. Sound sources are embedded in acoustic environments that contain any number of additional sources whose emissions reach the listener's ears concurrently. It is up to the listener to decode the acoustic informational mix, determine which sources are of interest, decide whether extra resources should be allocated to extracting more information from them, or act upon them. While decision making is a high-level process accomplished by the listener's cognition, selection and elimination of acoustic information is manifest along the entire auditory system, from periphery to cortex. This review examines latent informational paradigms in hearing research and demonstrates how several hearing mechanisms conspire to gradually eliminate information from the auditory sensory channel. It is motivated by the brain's computational need to decomplexify unpredictable real-world signals in real time. Decomplexification through information loss is suggested to constitute a unifying principle of the mammalian hearing system, which is specifically demonstrated in human hearing. This perspective can be readily generalised to other sensory modalities.
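To make the scale of the information reduction between the levels in Figure 2 concrete, here is a back-of-the-envelope comparison of acoustic versus linguistic information rates. These are generic textbook figures, not values from the review.

```python
# Illustrative orders of magnitude only (generic textbook figures).
acoustic_rate = 44_100 * 16   # bits/s for a 16-bit, 44.1 kHz mono audio stream
linguistic_rate = 12 * 6      # ~12 phonemes/s x ~6 bits/phoneme of running speech
print(f"{acoustic_rate} vs {linguistic_rate} bits/s "
      f"(~{acoustic_rate // linguistic_rate}x reduction)")
```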


Conversational speech levels and signal-to-noise ratios in realistic acoustic conditions

January 2019 · 148 Reads · 46 Citations

The Journal of the Acoustical Society of America

Estimating the basic acoustic parameters of conversational speech in noisy real-world conditions has been an elusive task in hearing research. Nevertheless, these data are essential ingredients for speech intelligibility tests and fitting rules for hearing aids. Previous surveys did not provide clear methodology for their acoustic measurements and setups, were opaque about their samples, or did not control for the distance between talker and listener, even though people are known to adapt their distance in noisy conversations. In the present study, conversations were elicited between pairs of people by asking them to play a collaborative game that required them to communicate. While performing this task, the subjects listened to binaural recordings of different everyday scenes, which were presented to them at their original sound pressure level (SPL) via highly open headphones. Their voices were recorded separately using calibrated headset microphones. The subjects were seated inside an anechoic chamber at 1 and 0.5 m distances. Precise estimates of realistic speech levels and signal-to-noise ratios (SNRs) were obtained for the different acoustic scenes, at broadband and third-octave levels. It is shown that with acoustic background noise above approximately 69 dB SPL at 1 m distance, or 75 dB SPL at 0.5 m, the average SNR can become negative. Interpolating between the two conditions shows that, had the conversation partners been allowed to optimize their positions by moving closer to each other, negative SNRs would only be observed above 75 dB SPL. The implications of the results for speech tests and hearing aid fitting rules are discussed.
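The two reported crossover points are consistent with simple spherical spreading, as the quick check below shows: halving the talker-listener distance adds about 6 dB of speech level at the receiver, shifting the negative-SNR crossover from roughly 69 to roughly 75 dB SPL.

```python
import numpy as np

# Level gain at the receiver when moving from 1 m to 0.5 m (spherical spreading).
distance_gain = -20 * np.log10(0.5 / 1.0)    # ~ +6.0 dB
print(f"crossover shifts from ~69 to ~{69 + distance_gain:.0f} dB SPL")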


Subjective attributes of realistic sound fields: What makes acoustic environments complex?

October 2016 · 11 Reads

The Journal of the Acoustical Society of America

Real-life acoustic environments are commonly considered complex, because they contain multiple sound sources in different locations, room reverberation, and movement, all at continuously varying levels. In comparison, most laboratory-generated sound fields in hearing research are much simpler. However, the continuum between simple and complex fields is likely a function of multiple instrumental as well as perceptual factors. Although most applied knowledge has been gathered from simple, controlled, and synthetic acoustic environments, inferring listeners' performance and perception in realistic scenarios from such data is non-trivial. To clarify some of these relationships, subjective responses to twelve virtualized acoustic environments were examined using exploratory questionnaires with both descriptive and rating questions, designed specifically to identify attributes that may correlate with perceived acoustic complexity. Environments were recorded and processed using higher-order ambisonics and reproduced over a 41-loudspeaker array in an anechoic chamber. Eleven environments spanned from a quiet library to a loud food court, and diffuse noise was used as a control environment. Attributes included questions about busyness, variability, reverberance, annoyance, and others. The results enable us to break down the perceived complexity into its constituent factors.


Estimating realistic speech levels for virtual acoustic environments

October 2016 · 24 Reads · 1 Citation

The Journal of the Acoustical Society of America

Speech intelligibility is commonly assessed in rather unrealistic acoustic environments at negative signal-to-noise ratios (SNRs). As a consequence, the results seem unlikely to reflect the subjects’ experience in the real world. To improve the ecological validity of speech tests, different sound reproduction techniques have been used by researchers to recreate field-recorded acoustic environments in the laboratory. Whereas the real-world sound pressure levels of these environments are usually known, this is not necessarily the case for the level of the target speech (and therefore the SNR). In this study, a two-talker conversation task is used to derive realistic target speech levels for given virtual acoustic environments. The talkers communicate with each other while listening to binaural recordings of the environments using highly open headphones. During the conversation their speech is recorded using close-talk microphones. Conversations between ten pairs of young normal-hearing talkers were recorded in this way in 12 different environments and the corresponding speech levels were derived. In this presentation, the methods are introduced and the derived speech levels are compared to results from the literature as well as from real sound-field recordings. The possibility of using this technique to generate environment-specific speech material with realistic vocal effort is discussed.
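A crude sketch of the kind of level estimate such calibrated close-talk recordings support: RMS over speech-active frames of a pressure signal. The frame length and activity gate are illustrative stand-ins for a formal active-speech-level measure such as ITU-T P.56, which the abstract does not specify.

```python
import numpy as np

def active_speech_level(p, fs, frame_s=0.02, gate_db=20.0):
    """Rough active speech level (dB SPL) from a calibrated pressure signal.

    p is the signal in pascals. Frames more than gate_db below the loudest
    frame are treated as pauses and excluded; this is a crude stand-in for a
    formal active-level measure, with illustrative parameter values.
    """
    n = int(frame_s * fs)
    frames = p[: len(p) // n * n].reshape(-1, n)
    power = np.mean(frames ** 2, axis=1) / (20e-6) ** 2   # re 20 uPa
    level = 10 * np.log10(power + 1e-12)                  # per-frame dB SPL
    active = level > level.max() - gate_db                # crude activity gate
    return 10 * np.log10(np.mean(10 ** (level[active] / 10)))
```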

Citations (6)


... They found that these visual stimuli, perceived as noise, increased interpersonal head-movement coordination. Similarly, Miles et al. [38] reported a significant increase in interpersonal body-movement coordination between participants when auditory background noise was present. They interpreted this as indicating that participants coupled their movements more closely to each other when verbal communication became more difficult. ...

Reference:

Effects of Delay on Nonverbal Behavior and Interpersonal Coordination in Video Conferencing
Behavioral dynamics of conversation, (mis)communication and coordination in noisy environments

... Whitmer et al. 11 found that hearing instruments with a lower spectral bandwidth did not compromise listeners' ability to accurately locate a sound source but did affect their motion behavior, such as delayed movement onset. Moving closer to the source generally increases its level and might serve as a social signal for listening difficulties, as suggested by Hadley et al. 12 and Weisser et al. 13, who revealed that listeners move closer to each other in situations where communication becomes more challenging. This evidence highlights that an ...

Conversational distance adaptation in noise and its effect on signal-to-noise ratio in realistic listening environments

The Journal of the Acoustical Society of America

... • sources distributed in space; acoustic source diversity; source-source interaction; reverberation; reflections; scattering; diffraction; diffusion; and a nonuniform medium for sound propagation (Weisser et al., 2019). Listening is affected not only by the background noise and reverberation but also by the different learning activities, student and teacher voices, student and teacher interactions, distances from the speaker, and multimodal factors such as gaze and gesture. ...

Complex Acoustic Environments: Review, Framework, and Subjective Model

... Nevertheless, they lay a solid foundation for soundscape research. As suggested by Weisser et al. (2019), databases should fulfill the following four minimum requirements to be suitable for hearing research: (1) the recording files are provided in a format that allows accurate spatial reproduction of the scenes via headphones and loudspeaker arrays; (2) the sound pressure levels (SPL) are provided, enabling reproduction at original levels; (3) scene descriptors are included, allowing for selection according to preference and comparison between databases; (4) recorded or simulated room impulse responses are provided or accessible, so that other acoustic material can be added to the scenes as required, for instance for implementing a speech-in-noise test. Although these guidelines were developed for hearing research mainly in the context of speech communication, they serve well as a basis for more general research on auditory perception. ...

The Ambisonic Recordings of Typical Environments (ARTE) Database
  • Citing Article
  • July 2019

Acta Acustica united with Acustica

Conversational speech levels and signal-to-noise ratios in realistic acoustic conditions
  • Citing Article
  • January 2019

The Journal of the Acoustical Society of America

... It might well be that the failure to demonstrate benefits in terms of speech recognition scores is due to ceiling effects: the speech reception threshold (SRT) measure commonly used in studies of people with mild to moderate hearing impairment corresponds to SNRs of between -10 dB and 0 dB. NR algorithms, however, have been shown to be most effective at positive SNRs (Fredelake, Holube, Schlueter, & Hansen, 2012; Smeds, Wolters, & Rung, 2015), which are in turn by far the most common signal-to-noise ratios listeners are exposed to in real life (Buchholz et al., 2016; Smeds et al., 2015). ...

Estimating realistic speech levels for virtual acoustic environments
  • Citing Article
  • October 2016

The Journal of the Acoustical Society of America