David Dorran
Technological University Dublin | TU Dublin · School of Electrical and Electronic Engineering
Ph.D., BSc. (Electrical/Electronic Engineering) PgDip (3rd Level Learning & Teaching)
44 publications · 27,210 reads
230 citations
Publications (44)
This document focuses on showing how the z-transform is typically used by engineers. It includes many code samples and worked examples. In addition, numerous sections/chapters give further insight into how and why the z-transform works in practice.
Provides a gentle introduction to the frequency-domain view of signals and negative frequency
This document provides a practical overview of how to design and implement filters with a focus on using Octave and Matlab
In this paper, an algorithm designed to detect characteristic cough events in audio recordings is presented, significantly reducing the time required for manual counting. Using time-frequency representations and independent subspace analysis (ISA), sound events that exhibit characteristics of coughs are automatically detected, producing a summary o...
Online simulations created using HTML, JavaScript and SVG images can be accessed by any end user with a web browser, making them widely accessible and compatible with computers running Microsoft, macOS or Linux-based operating systems. This paper outlines the methods and tools used by the authors to create interactive online educational simulations usi...
These notes on the Discrete Fourier Transform include numerous practical examples that make use of audio signals.
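As a small illustration of the transform these notes cover, the sketch below computes a direct DFT of a synthetic cosine. The test signal and the names `dft`, `signal` and `peak_bins` are assumptions made for this example and are not taken from the notes, which work with audio signals. It shows that a real cosine contributes energy at a positive-frequency bin and at its mirrored (negative-frequency) bin.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) discrete Fourier transform of a sequence."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


# Assumed test signal: a real cosine with exactly 3 cycles in 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]

magnitudes = [abs(X) for X in dft(signal)]

# Bins whose magnitude is clearly above floating-point noise.
peak_bins = [k for k, m in enumerate(magnitudes) if m > 1e-6]
```

Here `peak_bins` holds bin 3 and bin 13 (that is, N - 3), each with magnitude N/2; bin 13 is the one usually read as the negative-frequency component of the cosine.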
Link to arXiv - https://arxiv.org/abs/2104.06798
In this paper, an algorithm designed to detect characteristic cough events in audio recordings is presented, significantly reducing the time required for manual counting. Using time-frequency representations and independent subspace analysis (ISA), sound events that exhibit characteristics of coughs...
Cough sounds act as an important indicator of an individual's physical health, often used by medical professionals in diagnosing a patient's ailments. In recent years progress has been made in the area of automatically detecting cough events and, in certain cases, automatically identifying the ailment associated with a particular cough sound. Ethic...
While delivering a module on digital signal processing a series of one-to-one interviews were used extensively to assess undergraduate students. The interviews were organised so as to encourage students to focus on fundamentals before attempting to deal with more complex concepts. Feedback from the students about the process was extremely positive...
The ARX-LF model interprets voiced speech as an LF derivative glottal pulse exciting an all-pole vocal tract filter with an additional exogenous residual signal. It fully parameterizes the voice and has been shown to be useful for voice modification. Because time domain methods to determine the ARX-LF parameters from speech are very sensitive t...
It is crucial for many methods of inverse filtering that the time domain information of the glottal source waveform is known, e.g. the location of the instant of glottal closure. It is often the case that this information is unknown and/or cannot be determined due to e.g. recording conditions which can corrupt the phase spectrum. In these scenarios...
An approach is presented which generates an audio thumbnail of Irish Traditional music. An audio thumbnail is considered to be the most representative segment of the music. For popular music, the chorus is considered to be an ideal audio thumbnail; however, in Irish Traditional music there is no chorus. An Irish Traditional tune consists of two or m...
This paper presents an approach to estimate the glottal formant parameters of the voicing source in the frequency-domain. The method is based on a simplified pole-zero interpretation of the prevalent Liljencrants-Fant (LF) model of glottal flow, and gives approximations for a broad range of pulse shapes. An advantage of the method is that, unlike...
A framework is presented which addresses the issues related to the real-time implementation of synchronized video and audio time-scale and pitch-scale modification algorithms. It allows for seamless real-time transition between continually varying, independent time-scale and pitch-scale parameters arising as a result of manual or automatic interven...
An approach is presented which provides the tune change locations within a set of Irish traditional tunes. Also provided are semantic labels for each part of each tune within the set. A set in Irish traditional music is a number of individual tunes played segue. Each of the tunes in the set is made up of structural segments called parts. Musical...
Often when performing glottal closed phase covariance linear prediction, a positive real pole can appear in the resulting filter transfer function. The commonly adopted approach is to discard this pole, as it does not fit with the usual model of the all-pole vocal tract filter. However, this real pole describes some aspect of the speech signal; th...
Current advances in spoken interface design point to a shift towards more "human-like" interaction, as opposed to the traditional "push-to-talk" approach. However, human dialogue is characterized by synchrony and multi-modality, and these properties are not captured by traditional representation approaches, such as turn succession. This paper...
This paper presents ongoing research on convergence of speech features in human dialogues, in view of simulating this behaviour in spoken dialogue systems. The TAMA method (time-aligned moving average), previously used on monitoring convergence of acoustic prosodic (a/p) features, is applied to temporal properties of speech (between-turn pauses and...
Acoustic/prosodic feature (a/p) convergence is known to occur both in dialogues between humans and in human-computer interactions. Understanding the form and function of convergence is desirable for developing next generation conversational agents, as this will help increase speech recognition performance and naturalness of synthesize...
This paper addresses the problem of pitch tracking and voiced/unvoiced detection in noisy speech environments. An algorithm is presented which uses a number of variable thresholds to track pitch contour with minimal error. This is achieved by modeling the pitch tracking problem in such a way that allows the use of optimal estimation methods, such M...
A method for single channel source separation is proposed in this paper. It uses autocorrelation to estimate the delay coefficient of individual sources within an echoic mixture, following which a “pseudo-stereo mixture” is generated, to which the ADRess algorithm can be applied. The system is evaluated in a theoretical situation, where...
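The delay-estimation step mentioned in this abstract can be sketched roughly as follows. This is an illustrative toy and not the paper's implementation: the white-noise source, the single echo, and the names `estimate_echo_delay`, `true_delay` and `echo_gain` are all assumptions. It relies only on the fact that a signal mixed with a delayed copy of itself shows an autocorrelation peak at the delay lag.

```python
import random


def autocorrelation(x, max_lag):
    """Return r[k] = sum_n x[n] * x[n + k] for k = 0..max_lag."""
    return [
        sum(x[n] * x[n + k] for n in range(len(x) - k))
        for k in range(max_lag + 1)
    ]


def estimate_echo_delay(x, min_lag, max_lag):
    """Pick the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])


# Assumed toy source: seeded white noise, so the example is repeatable.
random.seed(0)
source = [random.uniform(-1.0, 1.0) for _ in range(2000)]

# Assumed echoic mixture: the source plus one attenuated, delayed copy.
true_delay, echo_gain = 37, 0.6
mixture = [
    s + (echo_gain * source[n - true_delay] if n >= true_delay else 0.0)
    for n, s in enumerate(source)
]

estimated_delay = estimate_echo_delay(mixture, min_lag=1, max_lag=100)
```

Real speech or music has autocorrelation structure of its own (for example pitch-related peaks), so a practical estimator would need more care than this noise-based toy suggests.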
Linear prediction is a signal processing technique that is used extensively in the analysis of speech signals and, as it is so heavily referred to in speech processing literature, a certain level of familiarity with the topic is typically required by all speech processing engineers. This paper aims to provide a well-rounded introduction to linear p...
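For readers new to the topic, a minimal sketch of linear prediction by the autocorrelation method is given below. It uses the standard Levinson-Durbin recursion; the function names and the decaying-exponential test signal are assumptions made for illustration and are not taken from the paper.

```python
def autocorrelation(x, max_lag):
    """Return r[k] = sum_n x[n] * x[n + k] for k = 0..max_lag."""
    return [
        sum(x[n] * x[n + k] for n in range(len(x) - k))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations recursively.

    Returns (a, error), where x[n] is predicted as
    sum_{i=1..order} a[i-1] * x[n-i] and error is the residual energy.
    """
    a = []
    error = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / error
        # Update the lower-order coefficients, then append the new one.
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        error *= 1.0 - k * k
    return a, error


# Assumed test signal: x[n] = 0.9**n, which satisfies x[n] = 0.9 * x[n-1].
x = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(x, 2), order=2)
```

For this signal the first coefficient comes out very close to 0.9 and the second close to zero, since one past sample already predicts the next one.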
A framework is presented which is designed to address the issues related to the real-time implementation of time-scale and pitch-scale modification algorithms. This framework can be used as the basis for the development of applications which allow for a seamless real-time transition between continually varying time-scale and pitch-scale parameters...
An approach which efficiently segments Irish Traditional Music into its constituent structural segments is presented. The complexity of the segmentation process is greatly increased due to melodic variation existent within this music type. In order to deal with these variations, a novel method using ‘set accented tones’ is introduced. The premise i...
Time-domain approaches to time-scale modification are popular due to their ability to produce high quality results at a relatively low computational cost. Within the category of time-domain implementations quite a number of alternatives exist, each with their own computational requirements and associated output quality. This paper provides a comput...
Frequency-domain approaches to audio time-scale modification introduce a reverberant artifact into the time-scaled output due to a loss in phase coherence between subband components. Whilst techniques have been developed which reduce the presence of this artifact, it remains a source of difficulty. A method of time-scaling is presented that reduces...
Audio time-scale modification is an audio effect that alters the duration of an audio signal without affecting its perceived local pitch and timbral characteristics. There are two broad categories of time-scale modification algorithms: time-domain and frequency-domain. The computationally efficient time-domain techniques produce high quality resul...
Time-domain audio time-scaling algorithms are efficient in comparison to their frequency-domain counterparts, but they rely upon the existence of a quasi-periodic signal to produce a high quality output. This requirement makes them unsuitable for direct application to complex multi-pitched signals such as polyphonic music. However, it has been show...
Time-domain time-scaling algorithms are efficient in comparison to their frequency-domain counterparts, but they rely upon the existence of a quasi-periodic signal to produce a high quality output. This requirement makes them unsuitable for use on multi-pitched signals such as polyphonic music. However, time-domain techniques applied on a subband b...
The duration of a speech passage can be altered using audio time-scale modification techniques. Time-scale modification can be achieved in the time domain by segmenting the input signal into overlapping frames and recombining the frames with an overlap differing from the analysis overlap. We present a time-scale modification algorithm that uses a s...
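The basic overlap-add idea described here (frames taken at one hop size and recombined at another) can be sketched as below. This is plain overlap-add only: it omits the frame synchronisation that the algorithm in the paper adds, and the function name, frame length and hop sizes are assumed values chosen for illustration.

```python
import math


def ola_time_scale(x, alpha, frame_len=256, synthesis_hop=128):
    """Plain overlap-add time scaling (no frame synchronisation).

    alpha > 1 lengthens the signal, alpha < 1 shortens it.
    """
    # Frames are read every analysis_hop samples, written every synthesis_hop.
    analysis_hop = max(1, round(synthesis_hop / alpha))
    # Periodic Hann window: copies spaced frame_len / 2 apart sum to 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_frames = max(0, (len(x) - frame_len) // analysis_hop + 1)
    out_len = (n_frames - 1) * synthesis_hop + frame_len if n_frames else 0
    y = [0.0] * out_len
    for m in range(n_frames):
        src = m * analysis_hop   # where the frame is read from the input
        dst = m * synthesis_hop  # where the frame is added into the output
        for n in range(frame_len):
            y[dst + n] += window[n] * x[src + n]
    return y


# Usage: stretch a constant test signal to roughly twice its length.
stretched = ola_time_scale([1.0] * 4096, alpha=2.0)
```

Without synchronisation the overlapping frames of a periodic signal are generally out of phase with one another, which is the problem that synchronised overlap-add variants set out to address.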
The PAOLA algorithm is an efficient algorithm for the time-scale modification of speech. It uses a simple peak alignment technique to synchronise synthesis frames and takes waveform properties and the desired time-scale factor into account to determine optimum algorithm parameters. However, PAOLA has difficulties with certain waveform types and can...
Time-scale modification of speech using a synchronised and adaptive overlap-add (SAOLA) algorithm. 114th Audio Engineering Society Convention, paper #5834, Amsterdam, the Netherlands, 2003. Abstract: The synchronised overlap-add (SOLA) algorithm is a commercially popular and considerably researched audio time-scale modification technique. It operate...
For both engineers and linguists, the computer synthesis of natural speech is an objective that would provide many useful applications to human-computer interaction, including the realm of electro-acoustic music. The purpose of this paper is to introduce the area of speech synthesis by providing an overview of the three main methods of computer spe...
This paper profiles the emergence of a significant body of research in audio engineering within the Faculties of Engineering and Applied Arts at Dublin Institute of Technology. Over a period of five years the group has had significant success in completing a Strand 3 research project entitled Digital Tools for Music Education (DiTME), followed by s...
Phase vocoder approaches to time-scale modification of audio introduce a reverberant/phasy artifact into the time-scaled output due to a loss in phase coherence between short-time Fourier transform (STFT) bins. Recent improvements to the phase vocoder have reduced the presence of this artifact, however, it remains a problem. A method of time-scalin...
This paper profiles the emergence of a significant body of research in audio engineering within the Faculties of Engineering and Applied Arts at Dublin Institute of Technology. Over a period of five years the group has had significant success in completing a Strand 3 research project entitled Digital Tools for Music Education (DiTME).
Phase vocoder based approaches to audio time-scale modification introduce a reverberant artefact into the time-scaled output. Recent techniques have been developed to reduce the presence of this artefact; however, these techniques have the effect of introducing additional issues relating to their application to multi-channel recordings. This paper...
An approach is presented which provides a structural segmentation of Irish Traditional Music. Chroma information is extracted at certain locations within the music. The resulting chroma vectors are compared to determine similar structural segments. Chroma is only calculated at "set accented tone" locations within the music. "Set accented tones" are...
Convergence of acoustic/prosodic (a/p) features between two speakers is a well-known property of human dialogue. It has been suggested that this particular aspect of human interaction should be implemented in spoken dialogue systems, so that they can be perceived as more “humanlike”. This paper presents a quantitative analysis method that can provi...
As a result of the modern phenomenon of globalization, accrediting agencies and employers alike are emphasizing the importance of non-technical (also called key, transferable, or generic) skills (critical thinking as in design, group skills, and communication skills) in engineering education in addition to the traditional technical skills. While th...