About
79 Publications · 40,298 Reads · 1,554 Citations
Publications (79)
We introduce a novel method for movie genre classification, capitalizing on a diverse set of readily accessible pretrained models. These models extract high-level features related to visual scenery, objects, characters, text, speech, music, and audio effects. To intelligently fuse these pretrained features, we train small classifier models with low...
We introduce VEMOCLAP: Video EMOtion Classifier using Pretrained features, the first readily available and open-source web application that analyzes the emotional content of any user-provided video. We improve our previous work, which exploits open-source pretrained models that work on video frames and audio, and then efficiently fuse the resulting...
In this paper we present a new approach for the generation of multi-instrument symbolic music driven by musical emotion. The principal novelty of our approach centres on conditioning a state-of-the-art transformer based on continuous-valued valence and arousal labels. In addition, we provide a new large-scale dataset of symbolic music paired with e...
The 11th UbiMus — Ubiquitous Music Workshop (https://dei.fe.up.pt/ubimus/) was held at the Center for High Artistic Performance, the house of the Orquestra Jazz Matosinhos (OJM) in Portugal, during September 6–8, 2021. It was organized by the Sound and Music Computing (SMC) Group of the Faculty of Engineering, University of Porto and INESC TEC, Por...
In this paper, we address a sub-topic of the broad domain of audio enhancement, namely musical audio bandwidth extension. We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network, with the goal of reconstructing a full-bandwidth output. Our main contribution centers on...
In this paper, we present TIV.lib, an open-source library for the content-based tonal description of musical audio signals. Its main novelty relies on the perceptually-inspired Tonal Interval Vector space based on the Discrete Fourier transform, from which multiple instantaneous and global representations, descriptors and metrics are computed, e.g.,...
This research proposes a new strategy for interacting with large web audio databases, aiming to foster creative processes inscribed in the epistemological paradigm connected to the growing use of databases as live, dynamic and collaborative repositories that can foster new sound design practices. A use case for assisting the generation of soundscap...
The goal of this work is to develop an application that enables music producers to use their voice to create drum patterns when composing in Digital Audio Workstations (DAWs). An easy-to-use and user-oriented system capable of automatically transcribing vocalisations of percussion sounds, called LVT - Live Vocalised Transcription, is presented. LVT...
We address the task of advertisement detection in broadcast television content. While typically approached from a video-only or audio-visual perspective, we present an audio-only method. Our approach centres on the detection of short silences which exist at the boundaries between programming and advertising, as well as between the advertisements th...
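The silence-boundary idea above can be sketched as a frame-level RMS gate (a minimal illustration, not the authors' implementation; the frame sizes and the `threshold` and `min_dur` values are assumptions):

```python
import math

def find_silences(samples, sr, frame=1024, hop=512, threshold=1e-3, min_dur=0.2):
    """Return (start, end) times in seconds of silent regions: contiguous runs
    of frames whose RMS falls below `threshold` for at least `min_dur` seconds."""
    times, silent = [], []
    for start in range(0, len(samples) - frame + 1, hop):
        chunk = samples[start:start + frame]
        rms = math.sqrt(sum(x * x for x in chunk) / frame)
        times.append(start / sr)
        silent.append(rms < threshold)
    regions, run_start = [], None
    for t, s in zip(times, silent):
        if s and run_start is None:
            run_start = t
        elif not s and run_start is not None:
            if t - run_start >= min_dur:
                regions.append((run_start, t))
            run_start = None
    if run_start is not None and times and times[-1] - run_start >= min_dur:
        regions.append((run_start, times[-1]))
    return regions
```

In an ad-detection pipeline the detected silences would then be treated as candidate programme/advert boundaries.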
We present a hierarchical harmonic mixing method for assisting users in the process of music mashup creation. Our main contributions are metrics for computing the harmonic compatibility between musical audio tracks at small- and large-scale structural levels, which combine and reassess existing perceptual relatedness (i.e., chroma vector similarity...
We present a novel model for the characterization of musical rhythms that is based on the pervasive rhythmic phenomenon of syncopation. Syncopation is felt when the sensation of the regular beat or pulse in the music is momentarily disrupted; the feeling arises from breaking more expected patterns such as pickups (anacrusis) and faster events that...
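A classic way to quantify the "disrupted beat" sensation described above is the Longuet-Higgins and Lee syncopation measure. The sketch below is a simplified variant for cyclic binary rhythms (the metrical weights and the handling of ties are assumptions, not the model from this paper):

```python
def lhl_syncopation(pattern, weights):
    """Simplified Longuet-Higgins & Lee style syncopation for a cyclic binary
    rhythm: each rest whose metrical weight exceeds that of the most recent
    onset contributes the (positive) weight difference to the score."""
    if 1 not in pattern:
        return 0
    n = len(pattern)
    score = 0
    for i in range(n):
        if pattern[i] == 0:
            # scan backwards (cyclically) for the most recent onset
            j = (i - 1) % n
            while pattern[j] == 0:
                j = (j - 1) % n
            diff = weights[i] - weights[j]
            if diff > 0:
                score += diff
    return score
```

An on-beat pattern scores zero, while an anticipated (syncopated) onset on a weak position followed by silence on a strong position scores positively.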
We present a method for assisting users in the process of music mashup creation. Our main contribution is a harmonic compatibility metric between musical audio samples which combines existing perceptual relatedness (i.e., chroma vectors or key affinity) and consonance approaches. Our harmonic compatibility metric is derived from Tonal Interval Space...
In this paper we present the INESC Key Detection (IKD) system which incorporates a novel method for dynamically biasing key mode estimation using the spatial displacement of beat-synchronous Tonal Interval Vectors (TIVs). We evaluate the performance of the IKD system at finding the global key on three annotated audio datasets and using three key-de...
We present D’accord, a generative music system for creating harmonically compatible accompaniments of symbolic and musical audio inputs with any number of voices, instrumentation and complexity. The main novelty of our approach centers on offering multiple ranked solutions between a database of pitch configurations and a given musical input based o...
In this paper we present SEED, a generative system capable of arbitrarily extending recorded environmental sounds while preserving their inherent structure. The system architecture is grounded in concepts from concatenative sound synthesis and includes three top-level modules for segmentation, analysis, and generation. An input audio signal is fir...
In this paper we present a 12-dimensional tonal space in the context of the Tonnetz, Chew’s Spiral Array, and Harte’s 6-dimensional Tonal Centroid Space. The proposed Tonal Interval Space is calculated as the weighted Discrete Fourier Transform of normalized 12-element chroma vectors, which we represent as six circles covering the set of all possib...
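The construction above, a weighted Discrete Fourier Transform of a normalised 12-element chroma vector, can be sketched as follows (the weight values here are placeholders for illustration, not the published calibration):

```python
import numpy as np

def tonal_interval_vector(chroma, weights=(3.0, 8.0, 11.5, 15.0, 14.5, 7.5)):
    """Map a 12-bin chroma vector to a 12-D Tonal Interval Vector: the six
    complex DFT coefficients k = 1..6 of the energy-normalised chroma, each
    scaled by a perceptual weight, with real and imaginary parts interleaved."""
    c = np.asarray(chroma, dtype=float)
    c = c / c.sum()                       # normalise to unit mass
    spectrum = np.fft.fft(c)              # DFT over the 12 pitch classes
    tiv = np.asarray(weights) * spectrum[1:7]
    return np.column_stack([tiv.real, tiv.imag]).ravel()
```

A useful property of this space follows directly from the DFT: transposing a pitch configuration circularly shifts the chroma vector, which changes only the phases of the coefficients, so distances and norms are transposition-invariant.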
The practice of harmonic mixing is a technique used by DJs for the beat-synchronous and harmonic alignment of two or more pieces of music. In this paper, we present a new harmonic mixing method based on psychoacoustic principles. Unlike existing commercial DJ-mixing software, which determines compatible matches between songs via key estimation and...
We present Conchord, a system for real-time automatic generation of musical harmony through navigation in a novel 12-dimensional Tonal Interval Space. In this tonal space, angular and Euclidean distances among vectors representing multi-level pitch configurations equate with music theory principles, and vector norms act as an indicator of consonan...
This paper presents a computational toolkit for the real-time and offline analysis of audio signals in Pure Data. Specifically, the toolkit encompasses tools to identify sound objects from an audio stream and to describe sound objects attributes adapted to music analysis and composition. The novelty of our approach in comparison to existing audio d...
In this paper we present a system, AutoMashUpper, for making multi-song music mashups. Central to our system is a measure of “mashability” calculated between phrase sections of an input song and songs in a music collection. We define mashability in terms of harmonic and rhythmic similarity and a measure of spectral balance. The principal novelty in...
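The harmonic-similarity component of a mashability measure can be sketched as cosine similarity between chroma vectors, maximised over semitone rotations to allow for key shifting (a minimal illustration under those assumptions; the system itself works on beat-synchronous chromagrams and combines this with rhythmic similarity and spectral balance):

```python
import math

def chroma_cosine(a, b):
    """Cosine similarity between two 12-bin chroma vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def best_shift_similarity(a, b):
    """Max cosine similarity over all 12 semitone rotations of b,
    returned as (similarity, shift_in_semitones)."""
    return max((chroma_cosine(a, b[k:] + b[:k]), k) for k in range(12))
```

The returned shift indicates how many semitones one song would need to be pitch-shifted to best match the other.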
This paper examines the correlation between musical dissonance and auditory roughness—the most significant factor of psychoacoustic dissonance—and the contribution of the latter to algorithmic composition. We designed an empirical study to assess how auditory roughness correlates with human judgments of dissonance in natural musical stimuli on th...
In order to better understand the musical properties which elicit an increased sensation of wanting to move when listening to music—groove—we investigate the effect of adding syncopation to simple piano melodies, under the hypothesis that syncopation is correlated to groove. Across two experiments we examine listeners' experience of groove to synth...
In this work we present a system that estimates and manipulates rhythmic structures from audio loops in real-time to perform syncopation transformations. The core of our system is a technique for the manipulation of syncopation in symbolic representations of rhythm. In order to apply this technique to audio signals we must first segment the audi...
This demo presents a Spotify App of the RAMA system that aims to improve the way music recommendations and artist relations are shown in Spotify. This new version includes the majority of the existing features from RAMA including: editing the visualization parameters (to obtain more detailed graphs); adding and removing node artists (for users wish...
In this article we propose a generative music model that recombines heterogeneous corpora of audio units on both horizontal and vertical dimensions of musical structure. In detail, we describe a system that relies on algorithmic strategies from the field of music information retrieval—in particular content-based audio processing strategies—to infer...
This paper presents a drum transcription algorithm adjusted to the constraints of real-time audio. We introduce an instance filtering (IF) method using sub-band onset detection, which improves the performance of a system having at its core a feature-based K-nearest neighbor classifier (KNN). The architecture proposed allows for adapting different p...
Groove is a sensation of movement or wanting to move when we listen to certain types of music; it is central to the appreciation of many styles such as Jazz, Funk, Latin, and many more. To better understand the mechanisms that lead to the sensation of groove, we explore the relationship between groove and systematic microtiming deviations. Manifest...
What are the properties of sound signals that induce the experience of groove in listeners? Groove is often described as the experience of music that makes people tap their feet and want to dance. The ShakeIt project looks into systematic patterns of signal properties (timing, metrical, dynamic, etc.) and relates them to the experience of groove. T...
This study addresses the relationship between syncopation and groove in simple piano melodies. The principal motivation is to test whether creating syncopations in simple melodies increases the perception of groove in listeners. The basic stimuli comprised 10 simple piano melodies (around 20 s in duration), synthesized using MIDI at a fixed tempo o...
In this paper we propose an audio beat tracking system, IBT, for multiple applications. The proposed system integrates an automatic monitoring and state recovery mechanism that applies (re-)inductions of tempo and beats on a multi-agent-based beat tracking architecture. This system sequentially processes a continuous onset detection function whil...
In this paper, we propose a method that can identify challenging music samples for beat tracking without ground truth. Our method, motivated by the machine learning method “selective sampling,” is based on the measurement of mutual agreement between beat sequences. In calculating this mutual agreement we show the critical influence of different eva...
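The mutual-agreement idea can be sketched by scoring each pair of beat sequences from a committee of trackers and averaging; the pairwise score below is an F-measure with a tolerance window (the 70 ms tolerance is a common convention used here for illustration, not necessarily the paper's exact choice of agreement measure):

```python
from itertools import combinations

def beat_f_measure(est, ref, tol=0.07):
    """F-measure between two beat sequences with a +/- tol matching window."""
    matched = set()
    hits = 0
    for b in est:
        for i, r in enumerate(ref):
            if i not in matched and abs(b - r) <= tol:
                matched.add(i)
                hits += 1
                break
    if hits == 0:
        return 0.0
    precision, recall = hits / len(est), hits / len(ref)
    return 2 * precision * recall / (precision + recall)

def mean_mutual_agreement(sequences, tol=0.07):
    """Average pairwise agreement of a committee of beat trackers;
    low values flag samples that are likely difficult."""
    pairs = list(combinations(sequences, 2))
    return sum(beat_f_measure(a, b, tol) for a, b in pairs) / len(pairs)
```

Because no ground truth enters the computation, low mean agreement can be used to select challenging samples automatically.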
Rather than analyse an audio signal directly, many beat tracking algorithms perform some transformation of the input, commonly using note onset times or applying a mid-level representation which emphasises them, as the basis for extracting beat times. In this paper we investigate the importance of the input representation by comparing seven different...
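A typical mid-level input of the kind discussed above is an onset detection function; the spectral flux variant below is one common choice, shown as a minimal sketch (frame and hop sizes are assumptions):

```python
import numpy as np

def spectral_flux(signal, frame=1024, hop=512):
    """Half-wave-rectified frame-to-frame magnitude-spectrum difference:
    large values indicate energy arriving in new frequency bins, i.e. onsets."""
    window = np.hanning(frame)
    prev = None
    odf = []
    for start in range(0, len(signal) - frame + 1, hop):
        mag = np.abs(np.fft.rfft(window * signal[start:start + frame]))
        if prev is not None:
            odf.append(np.sum(np.maximum(mag - prev, 0.0)))
        prev = mag
    return np.array(odf)
```

Peak-picking this function yields onset candidates, and the function itself can serve directly as the input to a beat tracker.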
In this paper, we present a novel approach to beat tracking evaluation, based on finding the error between automatically generated beat locations and ground truth annotations. The error is normalised to the current inter-annotation-interval, such that the greatest observable error can be ± 50% of a beat. We form a histogram of normalised beat err...
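The normalisation step described above can be sketched as follows: each beat is matched to its nearest annotation and the error is divided by the local inter-annotation interval, so every value falls in roughly [-0.5, +0.5] of a beat (a minimal illustration assuming at least two annotations):

```python
def normalized_beat_errors(beats, annotations):
    """Error of each beat relative to its nearest annotation, normalised by
    the local inter-annotation interval; values lie in about [-0.5, 0.5]."""
    annotations = sorted(annotations)
    errors = []
    for b in beats:
        j = min(range(len(annotations)), key=lambda i: abs(annotations[i] - b))
        if j == 0:
            iai = annotations[1] - annotations[0]
        elif j == len(annotations) - 1:
            iai = annotations[-1] - annotations[-2]
        else:
            iai = (annotations[j + 1] - annotations[j - 1]) / 2
        errors.append((b - annotations[j]) / iai)
    return errors
```

Binning these values gives the beat error histogram, whose shape distinguishes, for example, consistently late tracking from off-beat tracking.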
In this paper, an approach is presented that identifies music samples which are difficult for current state-of-the-art beat trackers. In order to estimate this difficulty even for examples without ground truth, a method motivated by selective sampling is applied. This method assigns a degree of difficulty to a sample based on the mutual disagreemen...
A new probabilistic framework for beat tracking of musical audio is presented. The method estimates the time between consecutive beat events and exploits both beat and non-beat information by explicitly modeling non-beat states. In addition to the beat times, a measure of the expected accuracy of the estimated beats is provided. The quality of th...
In this paper we establish a threshold for perceptually acceptable beat tracking based on the mutual agreement of a committee of beat trackers. In the first step we use an existing annotated dataset to show that mutual agreement can be used to select one committee member as the most reliable beat tracker for a song. Then we conduct a listening t...
Hardcore, jungle, and drum and bass (HJDB) are fast-paced electronic dance music genres that often employ resequenced breakbeats or drum samples from jazz and funk percussionist solos. We present a style-specific method for downbeat detection specifically designed for HJDB. The presented method combines three forms of metrical information in the pr...
In this paper, we propose a rhythmically informed method for onset detection in polyphonic music. Music is highly structured in terms of the temporal regularity underlying onset occurrences and this rhythmic structure can be used to locate sound events. Using a probabilistic formulation, the method integrates information extracted from the audio si...
We present a new evaluation method for measuring the performance of musical audio beat tracking systems. Central to our method is a novel visualization, the beat error histogram, which illustrates the metrical relationship between two quasi-periodic sequences of time instants: the output of a beat tracking system and a set of ground truth annotations...
SMALLbox is a new foundational framework for processing signals, using adaptive sparse structured representations. The main aim of SMALLbox is to become a test ground for exploration of new provably good methods to obtain inherently data-driven sparse models, able to cope with large-scale and complicated data. The toolbox provides an easy way to ev...
In this paper we explore the relationship between the temporal and rhythmic structure of musical audio signals. Using automatically extracted rhythmic structure we present a rhythmically-aware method to combine note onset detection techniques. Our method uses top-down knowledge of repetitions of musical events to improve detection performance by mo...
We present a new method for generating input features for musical audio beat tracking systems. To emphasise periodic structure we derive a weighted linear combination of sub-band onset detection functions driven by a measure of sub-band beat strength. Results demonstrate improved performance over existing state of the art models, in particular for mus...
A fundamental research topic in music information retrieval is the automatic extraction of beat locations from music signals. In this paper we address the under-explored topic of beat tracking evaluation. We present a review of existing evaluation models and, given their strengths and weaknesses, we propose a new method based on a novel visualisati...
In this paper we present a model for beat-synchronous analysis of musical audio signals. Introducing a real-time beat tracking model with performance comparable to offline techniques, we discuss its application to the analysis of musical performances segmented by beat. We discuss the various design choices for beat-synchronous analysis and their...
Time-scale transformations of audio signals have traditionally relied exclusively upon manipulations of tempo. We present a novel technique for automatic mixing and synchronization between two musical signals. In this transformation, the original signal assumes the tempo, meter, and rhythmic structure of the model signal, while the extracted down...
We outline a set of audio effects that use rhythmic analysis, in particular the extraction of beat and tempo information, to automatically synchronise temporal parameters to the input signal. We demonstrate that this analysis, known as beat-tracking, can be used to create adaptive parameters that adjust themselves according to changes in the proper...
Within ballroom dance music, tempo and rhythmic style are strongly related. In this paper we explore this relationship, by using knowledge of rhythmic style to improve tempo estimation in musical audio signals. We demonstrate how the use of a simple 1-NN classification method, able to determine rhythmic style with 75% accuracy, can lead to an 8% po...
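The classify-then-constrain idea can be sketched with a plain Euclidean 1-NN over rhythmic feature vectors, whose predicted style then selects a tempo range prior (the feature values and the BPM ranges below are illustrative assumptions, not the paper's data):

```python
def classify_style_1nn(query, training):
    """1-NN rhythmic-style classifier: `training` is a list of
    (feature_vector, style) pairs; returns the style of the nearest vector."""
    _, best_style = min(
        training,
        key=lambda item: sum((q - x) ** 2 for q, x in zip(query, item[0])))
    return best_style

# Illustrative per-style tempo priors in BPM (assumed values), used to
# restrict a tempo estimator to a plausible range for the detected style.
TEMPO_PRIOR = {"jive": (168, 184), "waltz": (84, 90), "tango": (120, 132)}
```

Once a style is predicted, the tempo estimator only needs to search inside the corresponding prior range, which is how a modest classifier can still improve tempo accuracy.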
We present a new group of audio effects that use beat tracking, the detection of beats in an audio signal, to relate effect parameters to the beats in an input signal. Conventional audio effects are augmented so that their operation is related to the output of a beat tracking system. We present a tempo-synchronous delay effect and a set of beat syn...
Despite continued attention toward the problem of automatic beat detection in musical audio, the issue of how to evaluate beat tracking systems remains pertinent and controversial. As yet no consistent evaluation metric has been adopted by the research community. To this aim, we propose a new method for beat tracking evaluation by measuring beat ac...
We present a new class of digital audio effects which can automatically relate parameter values to the tempo of a musical input in real-time. Using a beat tracking system as the front end, we demonstrate a tempo-dependent delay effect and a set of beat-synchronous low frequency oscillator (LFO) effects including auto-wah, tremolo and vibrato. The e...
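The parameter mapping behind such effects is simple once a tempo estimate is available: one beat at `bpm` lasts 60/bpm seconds, and both the delay time and the LFO period are tied to that value. A minimal sketch (the function names and the `division`/`cycles_per_beat` parameters are illustrative, not from the paper):

```python
import math

def beat_delay_seconds(bpm, division=1.0):
    """Delay time locked to the detected tempo: one beat lasts 60/bpm
    seconds; `division` scales it (e.g. 0.5 for an eighth note in 4/4)."""
    return 60.0 / bpm * division

def beat_lfo(t, bpm, cycles_per_beat=1.0, depth=1.0):
    """Beat-synchronous LFO value at time t (e.g. for tremolo or auto-wah):
    a sinusoid whose period is a fixed fraction of the beat period."""
    beat_period = 60.0 / bpm
    return depth * math.sin(2 * math.pi * cycles_per_beat * t / beat_period)
```

When the front-end beat tracker reports a tempo change, re-evaluating these mappings is all that is needed for the effect to stay synchronised.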
We present a simple and efficient method for beat tracking of musical audio. With the aim of replicating the human ability of tapping in time to music, we formulate our approach using a two state model. The first state performs tempo induction and tracks tempo changes, while the second maintains contextual continuity within a single tempo hypothesi...
This is an extended analysis of eight different algorithms for musical tempo extraction and beat tracking. The algorithms participated in the 2006 Music Information Retrieval Evaluation eXchange (MIREX), where they were evaluated using a set of 140 musical excerpts, each with beats annotated by 40 different listeners. Performance metrics were const...
This thesis is my own work, and any ideas or quotations from the work of other people, published or otherwise, are fully acknowledged in accordance with the standard referencing practices of the discipline. I acknowledge the helpful guidance and support of my supervisor, Dr Mark Plumbley. In this thesis we investigate the automatic extraction of rhythmic...
We introduce a method for detecting downbeats in musical audio given a sequence of beat times. Using musical knowledge that lower frequency bands are perceptually more important, we find the spectral difference between band-limited beat synchronous analysis frames as a robust downbeat indicator. Initial results are encouraging for this type of s...
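Given band-limited spectra computed once per beat, the downbeat position can be sketched by accumulating the rectified spectral difference at each position in the bar and taking the maximum (a sketch of the idea under these assumptions, not the paper's exact implementation):

```python
import numpy as np

def downbeat_phase(beat_frames, beats_per_bar=4):
    """Pick the bar position of the downbeat from per-beat low-band spectra:
    the bar position whose beats show the largest accumulated half-wave-
    rectified spectral difference from the previous beat is the downbeat."""
    frames = [np.asarray(f, dtype=float) for f in beat_frames]
    scores = np.zeros(beats_per_bar)
    for i in range(1, len(frames)):
        diff = np.sum(np.maximum(frames[i] - frames[i - 1], 0.0))
        scores[i % beats_per_bar] += diff
    return int(np.argmax(scores))
```

The intuition is that harmonic content tends to change at bar boundaries, so the largest beat-to-beat low-frequency change recurs at the downbeat position.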
We present details of our submissions to the Audio Tempo Extraction and Audio Beat Tracking contests within MIREX 2006. The approach we adopt makes use of our existing beat tracking technique with a modified tempo extraction stage, and with the provision of three different onset detection functions to act as input. For each onset detection functi...
In this paper we apply a two state switching model to the problem of audio based beat tracking. Our analysis is based around the generation and application of adaptively weighted comb filterbank structures to extract beat timing information from the midlevel representation of an input audio signal known as the onset detection function. We evaluate...
We provide an overview of two algorithms submitted for the Audio Tempo Extraction contest within MIREX 2005: A non-causal Matlab implementation (Davies submission) [4] and a real-time C implementation (Brossier submission) [3]. Both algorithms extract a primary and secondary tempo, associated phase values for each tempo and the relative salience...
We introduce a causal approach to tempo tracking for musical audio signals. Our system is designed towards an eventual real-time implementation; requiring minimal high-level knowledge of the musical audio. The tempo tracking system is divided into two sections: an onset analysis stage, used to derive a rhythmically meaningful representation...