About
Publications: 128
Reads: 65,359
Citations: 802
Current institution
Additional affiliations
April 2018 - present
September 2006 - December 2009
January 2010 - present
Publications (128)
This paper presents the Cadenza Woodwind Dataset. This publicly available data is synthesised audio for woodwind quartets including renderings of each instrument in isolation. The data was created to be used as training data within Cadenza's second open machine learning challenge (CAD2) for the task on rebalancing classical music ensembles. The dat...
Understanding lyrics is a major barrier to enjoying music for people with hearing loss. To improve lyric understanding through machine-learning, metrics need to be informed by the experiences of the target population. Currently, there are no data on lyric-recall ability of older individuals with hearing loss. Twelve older participants with mostly m...
It is well established that listening to music is an issue for those with hearing loss, and hearing aids are not a universal solution. How can machine learning be used to address this? This paper details the first application of the open challenge methodology to use machine learning to improve audio quality of music for those with hearing loss. The...
The study of perceived affective qualities (PAQs) in soundscape assessments has increased in recent years, with methods varying from in-situ to laboratory. Through technological advances, virtual reality (VR) has facilitated evaluations of multiple locations in the same experiment. In this paper, VR reproductions of different urban sites were...
The Cadenza project is an ongoing project that aims to improve music quality for those with a hearing loss. The project is running signal-processing and machine-learning challenges to address different listening issues and scenarios. During the first round, the challenge focused on non-causal music source separation to allow remixing for those with...
Introduction: Previous work on audio quality evaluation has demonstrated a developing convergence of the key perceptual attributes underlying judgments of quality, such as timbral, spatial and technical attributes. However, across existing research there remains a limited understanding of the crucial perceptual attributes that inform audio quality e...
Interior car noise refers to the general noise generated by the engine transmission, the interaction between road and tyres, and weather conditions such as turbulent wind. For drivers or passengers with hearing loss, these can create especially challenging listening situations. The Cadenza Project is organising a series of machine learning challeng...
This paper presents results from the Manchester Soundscape Experiment Online 2020. It consisted of a virtual reality (VR) experiment online where participants rated 12 different scenarios with questions. The selected locations were Piccadilly Gardens, Market Street, Peel Park, and a bus stop. Each site was visited and recorded with a 360 camera and...
Given the restrictions due to the global pandemic, online listening tests have become an alternative way of collecting data. Adapting to these conditions, the Manchester Soundscape Experiment Online drew 158 participants between August and November 2020. The objective was to investigate whether different soundscapes and crowd densities mo...
With social rituals usually involving sound, an archaeological understanding of a site requires the acoustics to be assessed. This paper demonstrates how this can be done with acoustic scale models. Scale modelling is an established method in architectural acoustics, but it has not previously been applied to prehistoric monuments. The Stonehenge mo...
Music has been shown to be capable of improving runners’ performance in treadmill and laboratory-based experiments. This paper evaluates a generative music system, namely HEARTBEATS, designed to create biosignal synchronous music in real-time according to an individual athlete’s heartrate or cadence (steps per minute). The tempo, melody, and timbra...
Sounds in urban areas have traditionally been treated as an annoyance for which noise control solutions aim mostly at reducing sound levels. However, recent studies demonstrate that soundscapes could also enhance the quality of life and become a resource for urban planning. This work aims to investigate how human presence in urban settings can modu...
Pupil dilation has previously been shown to be a useful involuntary marker of listening effort. An inverse relationship between pupil diameter and signal to noise ratio has been shown when speech is energetically masked by noise. The work reported here aimed to investigate whether this relationship also holds for informational masking. Informationa...
One of the advantages of object-based audio/broadcast over traditional channel-based delivery is that it allows for the rendering of personalized content when delivered to the listeners. The methods by which personalization is achieved often require an in-depth understanding of the problem domain. This paper describes the design and evaluation of...
With the purpose of assisting developments in urban soundscape design, the present research investigates how soundscapes can influence emotions and behaviours in public spaces aiming to find healthy reactions to “exciting” sonic environments. In the current study, an “exciting” soundscape represents sounds with high levels of pleasantness and event...
Auditory salience describes the extent to which sounds attract the listener’s attention. So far, there have not been any published studies testing if the location of sound relative to the listener influences its salience. In fact, not many experiments in general test auditory attention in a fully spatialised setting, with sounds in front and behind...
To gain better speech intelligibility and overall listening experience in broadcasts in which background sounds are accessible, changing the background rather than the foreground speech signal may be a less intrusive approach than the converse. In this study, the technique of spectral weighting was applied to the background. The frequency-dependent...
Acoustic Event Detection (AED) is an important task of machine listening which, in recent years, has been addressed using common machine learning methods like Non-negative Matrix Factorization (NMF) or deep learning. However, most of these approaches do not take into consideration the way that the human auditory system detects salient sounds. In this w...
Mobile devices are now ubiquitous in daily life and the number of activities that can be performed using them is continually growing. This implies increased attention being placed on the device and diverted away from events taking place in the surrounding environment. The impact of using a smartphone on pedestrians in the vicinity of urban traffic...
Sound is a crucial aspect of providing immersion in many areas of entertainment, affecting the user's experience aesthetically and physically, especially given the different reproduction technologies available. One of these technologies is the binaural technique, which already has applications in entertainment, such as in music, for exampl...
Object-based audio can be used to customize, personalize, and optimize audio reproduction depending on the specific listening scenario. To investigate and exploit the benefits of object-based audio, a framework for intelligent metadata adaptation was developed. The framework uses detailed semantic metadata that describes the audio objects, the loud...
Can externalizing dialogue when in the presence of stereo background noise improve speech intelligibility? This has been investigated for audio over headphones using head-tracking in order to explore potential future developments for small-screen devices. A quantitative listening experiment tasked participants with identifying target words in spoke...
While mixing, sound producers and audio professionals empirically set the speech-to-background ratio (SBR) based on rules of thumb and their own perception of sounds. There is no guarantee, however, that the speech content will be intelligible for the general population consuming content over a wide variety of devices. In this study, an approach to...
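As a rough illustration of the quantity named in the abstract above, the sketch below computes a speech-to-background ratio in decibels from the RMS levels of two signals and rescales the background to hit a target SBR. The RMS-based definition, the function names, and the toy sine signals are assumptions made for illustration; the paper's own approach is not visible in the truncated abstract.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def speech_to_background_ratio_db(speech, background):
    """SBR in dB, assumed here to be 20*log10(rms(speech) / rms(background))."""
    return 20.0 * math.log10(rms(speech) / rms(background))

def scale_background_for_sbr(speech, background, target_db):
    """Return a copy of the background rescaled so the SBR equals target_db."""
    current_db = speech_to_background_ratio_db(speech, background)
    gain = 10.0 ** ((current_db - target_db) / 20.0)
    return [gain * x for x in background]

# Toy stand-ins for real stems: two sines, the "background" at half amplitude.
speech = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
background = [0.5 * math.sin(2 * math.pi * 13 * n / 100) for n in range(100)]

sbr = speech_to_background_ratio_db(speech, background)
rescaled = scale_background_for_sbr(speech, background, 0.0)
print(round(sbr, 2))
```

A real mix would more likely use a perceptual loudness measure than plain RMS; RMS is used here only to keep the arithmetic transparent.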
Featured Application: The numerical methods described in this paper can be used in the automatic creation of artificial datasets of audio mixes, as real-world mixes are both scarce and costly to produce. Such datasets can be used for a variety of applications, such as material for signal analysis, audio stimuli in psychoacoustic testing or as a popu...
Music and audio applications are well suited to tactile control. In sound and music computing there can be a disconnect between design of human-computer interfacing and application congruent design. A categorical approach is proposed, considering active and passive control methods. This work has implications for the design of adaptive or ‘on-the-fl...
Intelligent music production tools aim to assist the user by automating music production tasks. Many previous systems sought to create the best possible mix based on technical parameters but rarely has subjectivity been directly incorporated. This paper proposes that a new generation of tools can be designed based on evolutionary computation method...
During the 1980s, acoustic studies of Upper Palaeolithic imagery in French caves, using the technology then available, suggested a relationship between acoustic response and the location of visual motifs. This paper presents an investigation, using modern acoustic measurement techniques, into such relationships within the caves of La Garma, Las Chim...
A review of the evolution and acoustics of ancient open-air theatres is undertaken. The acoustics of ancient open-air theatres are characterized by remarkably high speech intelligibility. This is due to the seating area (Cavea) and Orchestra. Thus, the acoustics of the Cavea is evaluated from the literature and through measurements. Geometrical aco...
It has been reported by numerous studies on distance perception in VR that a compression of visual space occurs in virtual environments presented using stereoscopic techniques. Other studies have shown that modified environmental auditory cues can affect egocentric spatial perception and that increased order of modality improved the experience of i...
Hitherto, not many studies have dealt with spatial auditory saliency. Auditory attention studies concerned with spatial aspects generally concentrate on top-down selective or divided attention, e.g., where subjects are asked to attend to one source at a specific location whilst being distracted with sources from different...
Previous archaeoacoustics work published from the 1980s to the 2000s has suggested that the locations of palaeolithic paintings in French caves, such as Le Portel, Niaux, Isturitz, and Arcy-sur-Cure, are associated with the acoustic response of those locations, particularly with strong low frequency resonances. Recent work done in caves in the Ast...
A novel methodology for intelligent music production has been developed using evolutionary computation. Mixes are generated by exploration of a "mix-space", which consists of a series of inter-channel volume ratios, allowing efficient generation of random mixes. An interactive genetic algorithm was used, allowing the user to rate mixes and guide...
A distortion-weighted glimpse proportion metric (BiDWGP) for predicting binaural speech intelligibility was evaluated in simulated anechoic and reverberant conditions, with and without a noise masker. The predictive performance of BiDWGP was compared to four reference binaural intelligibility metrics, which were extended from the Speech Intelligib...
One criterion in the design of binaural sound scenes in audio production is the extent to which the intended speech message is correctly understood. Object-based audio broadcasting systems have permitted sound editors to gain more access to the metadata (e.g., intensity and location) of each sound source, providing better control over speech intell...
To further the development of intelligent music production tools towards generating mixes that would realistically be created by a human mix-engineer, it is important to understand what kind of mixes can be created, and are typically created, by human mix-engineers. This paper presents an analysis of 1501 mixes, over 10 different songs, created by...
The quality of recorded music is often highly disputed. To gain insight into the dimensions of quality perception, subjective and objective evaluation of musical program material, extracted from commercial CDs, was undertaken. It was observed that perception of audio quality and liking of the music can be affected by separate factors. Familiarity w...
The act of mix-engineering is a complex combination of creative and technical processes; analysis is often performed by studying the techniques of a few expert practitioners, qualitatively. We propose to study the actions of a large group of mix-engineers of varying experience, introducing quantitative methodology to investigate mix-variation and t...
A psychoacoustic experiment was carried out to test the effects of microphone handling noise on perceived audio quality. Handling noise is a problem affecting both amateurs using their smartphones and cameras, as well as professionals using separate microphones and digital recorders. The noises used for the tests were measured from a variety of dev...
For field recordings and user-generated content recorded on phones, tablets, and other mobile devices, nonlinear distortions caused by clipping and limiting at pre-amplification stages, and by dynamic range control (DRC), are common causes of poor audio quality. A single-ended method to detect these distortions and predict perceived degradation in speec...
Many of us now carry around technologies which allow us to record sound, whether that is the sound of our child's first music concert on a digital camera or a recording of a practical joke on a mobile phone. However, the production quality of the sound on user-generated content is often very poor: distorted, noisy, with garbled speech or indistinct...
While many listeners can determine varying levels of audio quality, what is not always clear is what criteria have been used in the decision-making process. A subjective listening test has been undertaken in which participants rated the audio quality of 62 samples of commercially available popular music, from 1982 to 2013. In addition to providing a...
The mixing of audio signals has been at the foundation of audio production since the advent of electrical recording in the 1920s, yet the mathematical and psychological bases for this activity are relatively under-studied. This paper investigates how the process of mixing music is conducted. We introduce a method of transformation from a "gain-sp...
Room modes cause audible artifacts in listening environments. Modal control approaches have emerged in scientific literature over the years and, often, their performance is measured by criteria that may be perceptually unfounded. Previous research has shown modal decay as a key perceptual factor in detecting modal effects. In this work, perceptual...
Among the various wave-based simulation methods, the finite difference time domain (FDTD) method provides a reasonable trade-off between applicability, accuracy and computational efficiency. With the growing availability of computing power and recent advances in parallel architectures, FDTD has become a feasible choice for room acoustics simulation...
Wind can induce noise on microphones, causing problems for users of hearing aids and for those making recordings outdoors. Perceptual tests in the laboratory and via the Internet were carried out to understand what features of wind noise are important to the perceived audio quality of speech recordings. The average A-weighted sound pressure level o...
Since digital audio is encoded as discrete samples of the audio waveform, much can be said about a recording by the statistical properties of these samples. In this paper, a dataset of CD audio samples is analysed; the probability mass function of each audio clip informs a feature set which describes attributes of the musical recording related to l...
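To make the idea in the abstract above concrete, the following sketch builds an empirical probability mass function (PMF) over integer sample values and derives one example feature from it. The feature shown (the probability mass at the two extreme 16-bit codes) and all names are hypothetical illustrations; the truncated abstract does not say which features the paper actually uses.

```python
import math
from collections import Counter

INT16_MIN, INT16_MAX = -32768, 32767  # 16-bit CD audio sample range

def sample_pmf(samples):
    """Empirical PMF: maps each sample value to its relative frequency."""
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in sorted(counts.items())}

def extreme_code_mass(pmf):
    """Hypothetical feature: share of samples at the extreme codes."""
    return pmf.get(INT16_MIN, 0.0) + pmf.get(INT16_MAX, 0.0)

def quantise(x):
    """Round to the nearest integer and clip to the 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, round(x)))

# Toy clip: a sine driven past full scale, so some samples are clipped.
clip = [quantise(40000 * math.sin(2 * math.pi * n / 100)) for n in range(1000)]

pmf = sample_pmf(clip)
mass = extreme_code_mass(pmf)
print(len(pmf), round(mass, 2))
```

A large mass at the extreme codes suggests clipping, which is one way a sample-level statistic can say something about a recording without inspecting the waveform in time.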
Low-frequency acoustic effects have been well documented in archaeological studies for nearly two decades. However, to date, specialist acoustic input into the field of archaeoacoustics has been in the minority of research effort put into this emerging branch of scientific study. This project aims to investigate the initial findings regarding low-f...
In finite difference time domain simulation of room acoustics, source functions are subject to various constraints. These depend on the way sources are injected into the grid and on the chosen parameters of the numerical scheme being used. This paper addresses the issue of selecting and designing sources for finite difference simulation, by first r...
A dataset of audio clips was prepared and audio quality assessed by subjective testing. Encoded as digital signals, a large amount of feature-extraction was possible. A new objective metric is proposed, describing the Gaussian nature of a signal's amplitude distribution. Correlations between objective measurements of the music signals and the subj...
Wind-induced microphone noise is one of the most common problems leading to poor audio quality in recordings. A wind-noise detector could alert the operator of a recording device to the presence of wind noise so that appropriate action can be taken. This paper presents a single channel algorithm which, in the presence of other sounds, detects a...
AUTHOR'S NOTE: This is a POMA (ICA) conference paper, not a JASA paper. This cannot be changed in ResearchGate.
Binaural room impulse responses are important for auralization as well as for objective research in room acoustics. In geometrical room simulation methods, obtaining such responses is easily achieved by convolving each computed...
This paper will present results from a systematic investigation into functional and aesthetic audio quality of speech recordings degraded by wind noise. The major source of wind noise tested comes from velocity fluctuations interacting with the transducer, generating pressure fluctuations at the microphone diaphragm. To better understand the effect...
Research is being undertaken to unravel the acoustic response of Stonehenge, which is the largest and most complex ancient stone circle known to mankind. Perhaps the first time acoustic effects at Stonehenge were noticed was during its first phase of construction which corresponds to the bank and ditch and the 56 Aubrey holes which are now believ...
Accurate distance cues are important in the degree of realism provided by virtual audio systems. In the last decade there has been an increased interest in this research area. The main focus of this research project is to investigate the effect of different acoustic cues related to distance perception, such as Direct to Reverberant ratio (D/R), in...
The Finite Difference Time Domain (FDTD) method is becoming increasingly popular for room acoustics simulation. Yet, the literature on grid excitation methods is relatively sparse, and source functions are traditionally implemented in a hard or additive form using arbitrarily-shaped functions which do not necessarily obey the physical laws of sound...
As a significant and growing source of the world's energy, wind turbine reliability is becoming a major concern. At least two fault detection techniques for condition monitoring of wind turbine blades have been reported in early literature, i.e. acoustic emissions and optical strain sensors. These require off-site measurement. The work presented he...
In small rooms, low-frequency modes have a degrading influence on the quality of the bass components of music. Using objective measures to correct these modes often fails because they do not correspond to the subjective experience of listeners. This research begins with a procedure that elicits a compact set of four verbal descriptors from subjects...
An initial investigation into the performance of acoustic condition monitoring in the detection of structural faults in turbine blades has been carried out. The focus is to design a non-contact condition monitoring method which might allow the detection of incipient faults in the turbine blades therefore preventing major breakdown and potentially r...
With the rapid growth of computational power and recent advances in GP-GPU technology, numerical time domain methods are becoming increasingly popular for room acoustics applications due to their accuracy, simplicity and ease of implementation. However, in order to model realistic spaces one should consider boundary conditions and source directivit...
This paper describes and evaluates an objective measurement that grades the quality of a complex musical signal. The authors have previously identified a potential correlation between inter-band dynamics and the subjective quality of produced music excerpts. This paper describes the previously presented Inter-Band Relationship (IBR) descriptor and...
Room modes are well known to cause unwanted effects in the correct reproduction of low frequencies in critical listening rooms. Methods to control these problems range from simple loudspeaker/listener positioning to quite complex digital signal processing. Nonetheless, the subjective importance and impact of these methods has rarely been quantified...
Nowadays there is an explosion of home music production activities such as recording, mixing and mastering, which usually take place in small spaces that are not ideal for such tasks. Their reduced volume and the typical construction of homes, which usually have strong walls in common, facilitate the formation of standing wa...
A system has been investigated for the detection of incoming direction of an emergency vehicle. Acoustic detection methods based on a cross microphone array have been implemented. It is shown that source detection based on time delay estimation outperforms sound intensity techniques, although both techniques perform well for the application. The re...
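The time-delay-estimation idea mentioned in the abstract above can be sketched for a single microphone pair: find the lag that maximises the cross-correlation of the two signals, then convert that delay to a bearing under a far-field assumption. This is a minimal sketch, not the paper's system; the speed of sound, sample rate, microphone spacing, and function names are assumed values, and a cross array would combine two such orthogonal pairs.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def estimate_delay(a, b, max_lag):
    """Lag in samples by which b trails a, via brute-force cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(a)):
            m = n + lag
            if 0 <= m < len(b):
                score += a[n] * b[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bearing_from_delay(delay_samples, sample_rate, spacing):
    """Far-field angle from broadside, in degrees, for one microphone pair."""
    tau = delay_samples / sample_rate
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / spacing))
    return math.degrees(math.asin(s))

# Toy signals: a short tapered burst at mic A, the same burst 7 samples later at mic B.
length, start, burst_len, true_delay = 200, 50, 40, 7
mic_a = [0.0] * length
for k in range(burst_len):
    mic_a[start + k] = math.sin(2 * math.pi * 0.1 * k) * (1 - k / burst_len)
mic_b = [0.0] * true_delay + mic_a[:length - true_delay]

delay = estimate_delay(mic_a, mic_b, max_lag=20)
bearing = bearing_from_delay(delay, sample_rate=48000, spacing=0.2)
print(delay, round(bearing, 1))
```

The brute-force search is quadratic in signal length; a practical detector would use an FFT-based correlation and some weighting against reverberation, which this sketch omits.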
This paper presents an investigation into the diagnosis of transmission belt condition through acoustic monitoring. A relevant belt model and laser interferometry measurements are used to guide the design and analysis of the acoustic monitoring system. The fault under scrutiny is the development of a loss of tension in the belt which may occur due...
In this paper, a new method is proposed by combining ensemble empirical mode decomposition (EEMD) with order tracking techniques to analyse the vibration signals from a two stage helical gearbox. The method improves EEMD results in that it overcomes the potential deficiencies and achieves better order spectrum representation for fault diagnosis. Ba...
This paper presents the use of the induction motor current to identify and quantify common faults within a two-stage reciprocating compressor. The theoretical basis is studied to understand current signal characteristics when the motor undertakes a varying load under faulty conditions. Although conventional bispectrum representation of current sign...
Spurious noise in a car cabin can be not only annoying but also indicative of some potential faults. A small square microphone array with 4 sensors was adopted in this paper to localize the sound source in the car for fault diagnosis. A new voice activity detection (VAD) algorithm was proposed for the typical discontinuous short-time noise in car due...
Attempts have long been made to classify a room's low frequency audio reproduction capability with regards to its aspect ratio. Common metrics used have relied on the homogeneous distribution of modal frequencies and from these a number of 'optimal' aspect ratios have emerged. However, most of these metrics ignore the source and receiver coupling t...
This paper introduces a new method for the estimation of sound source distance and direction using at least three microphone sensors in indoor environments. Unlike other methods that normally use approximations in obtaining the time difference between sensors, this method exploits the existing geometrical relationships of the sensors to form an...
Timing belt transmission is a key subsystem of internal combustion engines. Faults in such belt systems lead to power loss, increased emissions and, in case of failure, may even cause severe damage to the whole engine. Hitherto, the physical condition of the belt has been assessed manually, which is both inconvenient and inaccurate. It is well...