Jakob Abeßer
Fraunhofer Institute for Digital Media Technology (IDMT) · Semantic Music Technologies Research Group

Dr.-Ing.

About

89 Publications
29,706 Reads
712 Citations

Publications (89)
Conference Paper
The deployment of machine listening algorithms in real-world application scenarios is challenging. In this paper, we investigate how the superposition of multiple sound events within complex sound scenes affects their recognition. As a basis for our research, we introduce the Urban Sound Monitoring (USM) dataset, which is a novel public benchmark d...
Preprint
Full-text available
In the context of music information retrieval, similarity-based approaches are useful for a variety of tasks that benefit from a query-by-example scenario. Music, however, naturally decomposes into a set of semantically meaningful factors of variation. Current representation learning strategies pursue the disentanglement of such factors from deep re...
Preprint
Full-text available
The deployment of machine listening algorithms in real-life applications is often impeded by a domain shift caused for instance by different microphone characteristics. In this paper, we propose a novel domain adaptation strategy based on disentanglement learning. The goal is to disentangle task-specific and domain-specific characteristics in the a...
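As a loose illustration only (not the architecture from this preprint), the sketch below splits an encoder's latent vector into a task-specific and a domain-specific half, each with its own classification head; all layer sizes, class counts, and names are assumptions.

```python
# Hypothetical sketch: latent vector split into task- and domain-specific halves.
import torch
import torch.nn as nn

class DisentangledClassifier(nn.Module):
    def __init__(self, n_mels=64, latent_dim=128, n_classes=10, n_domains=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_mels, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        half = latent_dim // 2
        # One head predicts the sound event class from the task-specific half,
        # the other predicts the recording domain (e.g. microphone type)
        # from the domain-specific half.
        self.task_head = nn.Linear(half, n_classes)
        self.domain_head = nn.Linear(half, n_domains)

    def forward(self, x):
        z = self.encoder(x)
        z_task, z_domain = torch.chunk(z, 2, dim=-1)
        return self.task_head(z_task), self.domain_head(z_domain)

model = DisentangledClassifier()
dummy = torch.randn(8, 64)          # batch of averaged mel spectra (assumed input)
task_logits, domain_logits = model(dummy)
print(task_logits.shape, domain_logits.shape)  # torch.Size([8, 10]) torch.Size([8, 3])
```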
Conference Paper
The development of robust acoustic traffic monitoring (ATM) algorithms based on machine learning faces several challenges. The biggest challenge is to collect and annotate a suitable data set for model training and evaluation, which must reflect a broad variety of vehicle sounds since their emitted acoustic noise patterns depend on a variety of fac...
Article
The development of robust acoustic traffic monitoring (ATM) algorithms based on machine learning faces several challenges. The biggest challenge is to collect and annotate large high-quality datasets for algorithm training and evaluation. Such a dataset must reflect a broad variety of vehicle sounds since their emitted acoustic noise patterns depen...
Conference Paper
Musicological studies on jazz performance analysis commonly require a manual selection and transcription of improvised solo parts, both of which can be time-consuming. In order to expand these studies to larger corpora of jazz recordings, algorithms for automatic content analysis can accelerate these processes. In this study, we aim to detect the p...
Conference Paper
In many urban areas, traffic load and noise pollution are constantly increasing. Automated systems for traffic monitoring are promising countermeasures, which make it possible to systematically quantify and predict local traffic flow in order to support municipal traffic planning decisions. In this paper, we present a novel open benchmark dataset, containi...
Preprint
Full-text available
This paper introduces a novel dataset for polyphonic sound event detection in urban sound monitoring use-cases. Based on isolated sounds taken from the FSD50k dataset, 20,000 polyphonic soundscapes are synthesized with sounds being randomly positioned in the stereo panorama using different loudness levels. The paper gives a detailed discussion of p...
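A minimal sketch of the kind of soundscape synthesis described above, assuming mono source clips and a simple constant-power panning law; gain ranges, durations, and function names are illustrative, not taken from the dataset's actual generation pipeline.

```python
# Hypothetical sketch: mix isolated mono sounds into a stereo soundscape
# with random panning and loudness, loosely following the description above.
import numpy as np

SR = 44100

def mix_soundscape(events, duration=10.0, rng=None):
    """events: list of 1-D numpy arrays (isolated mono sounds, same sample rate)."""
    rng = rng or np.random.default_rng()
    scene = np.zeros((2, int(SR * duration)))
    for event in events:
        gain_db = rng.uniform(-24.0, 0.0)        # random loudness level in dB
        pan = rng.uniform(0.0, 1.0)              # 0 = hard left, 1 = hard right
        start = int(rng.integers(0, scene.shape[1] - len(event)))
        scaled = event * 10.0 ** (gain_db / 20.0)
        # constant-power panning into the stereo panorama
        scene[0, start:start + len(event)] += scaled * np.cos(pan * np.pi / 2)
        scene[1, start:start + len(event)] += scaled * np.sin(pan * np.pi / 2)
    return scene

rng = np.random.default_rng(0)
fake_events = [rng.standard_normal(SR) * 0.1 for _ in range(5)]  # 1 s of noise each
print(mix_soundscape(fake_events, rng=rng).shape)                # (2, 441000)
```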
Preprint
Full-text available
In many urban areas, traffic load and noise pollution are constantly increasing. Automated systems for traffic monitoring are promising countermeasures, which make it possible to systematically quantify and predict local traffic flow in order to support municipal traffic planning decisions. In this paper, we present a novel open benchmark dataset, containi...
Article
Full-text available
In this work, we propose taking local polyphony information into account for multi-pitch estimation (MPE) in piano music recordings. To that aim, we propose a method for local polyphony estimation (LPE), which is based on convolutional neural networks (CNNs) trained in a supervised fashion to explicitly predict the degree of polyphony. We investigate...
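For illustration, a toy CNN that classifies the degree of polyphony from a spectrogram patch could look roughly like this; the layer configuration and patch size are assumptions, not the network from the paper.

```python
# Hypothetical sketch: a small CNN that classifies the degree of polyphony
# (0..max_polyphony simultaneous notes) from a spectrogram patch.
import torch
import torch.nn as nn

class PolyphonyCNN(nn.Module):
    def __init__(self, max_polyphony=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, max_polyphony + 1)

    def forward(self, x):            # x: (batch, 1, freq_bins, frames)
        h = self.features(x).flatten(1)
        return self.classifier(h)    # logits over polyphony degrees

model = PolyphonyCNN()
patch = torch.randn(4, 1, 216, 25)  # e.g. CQT patches (sizes are assumptions)
print(model(patch).shape)           # torch.Size([4, 7])
```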
Article
Full-text available
In this paper, we adapt a recently proposed U-net deep neural network architecture from melody to bass transcription. We investigate pitch shifting and random equalization as data augmentation techniques. In a parameter importance study, we study the influence of the skip connection strategy between the encoder and decoder layers, the data augmenta...
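As a rough sketch of the two augmentation techniques mentioned above, here is one way they could be applied directly on a log-frequency spectrogram; the paper's actual implementation and parameter ranges may differ.

```python
# Hypothetical sketch: pitch shifting as a frequency-bin shift and "random
# equalization" as a smooth random gain curve over frequency.
import numpy as np

def pitch_shift_bins(spec, shift_bins):
    """Shift a (freq_bins, frames) log-frequency spectrogram along frequency."""
    # np.roll wraps bins around at the edges; zero-padding would be cleaner.
    return np.roll(spec, shift_bins, axis=0)

def random_eq(spec, rng, max_gain_db=6.0, n_nodes=8):
    """Multiply each frequency bin by a smooth, randomly drawn gain curve."""
    nodes = rng.uniform(-max_gain_db, max_gain_db, size=n_nodes)
    curve_db = np.interp(np.linspace(0, n_nodes - 1, spec.shape[0]),
                         np.arange(n_nodes), nodes)
    return spec * 10.0 ** (curve_db[:, None] / 20.0)

rng = np.random.default_rng(1)
spec = np.abs(rng.standard_normal((192, 100)))       # fake CQT magnitude
augmented = random_eq(pitch_shift_bins(spec, rng.integers(-4, 5)), rng)
print(augmented.shape)                                # (192, 100)
```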
Chapter
Electroacoustic music is experienced primarily through auditory perception, as it is not usually based on a prescriptive score. For the analysis of such pieces, transcriptions are sometimes created to illustrate events and processes graphically in a readily comprehensible way. These are usually based on the spectrogram of the recording. Although th...
Preprint
Full-text available
Research on sound event detection (SED) in environmental settings has seen increased attention in recent years. Large amounts of (private) domestic or urban audio data raise significant logistical and privacy concerns. The inherently distributed nature of these tasks makes federated learning (FL) a promising approach to take advantage of large-scal...
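To make the federated learning idea concrete, the following sketch shows the basic FedAvg aggregation step, i.e., averaging locally trained model weights proportionally to client dataset sizes; the models and client sizes are toy stand-ins, not the paper's setup.

```python
# Hypothetical sketch of federated averaging (FedAvg): clients train locally
# and the server averages their weights, weighted by local dataset size.
import torch
import torch.nn as nn

def federated_average(client_states, client_sizes):
    total = float(sum(client_sizes))
    avg = {}
    for key in client_states[0]:
        avg[key] = sum(state[key] * (n / total)
                       for state, n in zip(client_states, client_sizes))
    return avg

clients = [nn.Linear(64, 10) for _ in range(3)]      # stand-ins for local SED models
sizes = [1200, 800, 2000]                            # local dataset sizes (assumed)
global_model = nn.Linear(64, 10)
global_model.load_state_dict(
    federated_average([c.state_dict() for c in clients], sizes))
print(next(iter(global_model.state_dict())))         # 'weight'
```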
Conference Paper
Full-text available
In this paper, we investigate a previously proposed algorithm for spoken language identification based on convolutional neural networks and convolutional recurrent neural networks. We improve the algorithm by modifying the training strategy to ensure equal class distribution and efficient memory usage. We successfully replicate previous experimenta...
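One common way to ensure an equal class distribution during training is to oversample minority classes; the sketch below shows such a balanced index sampler and is only an assumed illustration of the modified training strategy.

```python
# Hypothetical sketch: oversample each language class to the size of the
# largest class so every training epoch sees a balanced class distribution.
import numpy as np

def balanced_indices(labels, rng=None):
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    indices = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        indices.append(rng.choice(idx, size=target, replace=True))
    indices = np.concatenate(indices)
    rng.shuffle(indices)
    return indices

labels = ["en"] * 500 + ["de"] * 200 + ["fr"] * 120   # toy class counts
idx = balanced_indices(labels, np.random.default_rng(0))
print(len(idx))                                       # 1500 (3 classes x 500)
```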
Chapter
In this paper, we approach the problem of detecting segments of singing voice activity in opera recordings. We consider three state-of-the-art methods for singing voice detection based on supervised deep learning. We train and test these models on a novel dataset comprising three annotated performances (versions) of Richard Wagner’s opera “Die Walk...
Article
Full-text available
The number of publications on acoustic scene classification (ASC) in environmental audio recordings has increased steadily over the last years. This was mainly stimulated by the annual Detection and Classification of Acoustic Scenes and Events (DCASE) competition with its first edition in 2013. All competitions so far have involved one or multiple ASC tas...
Article
The objective measurement of subjective noise perception can help to trace the origins of noise more precisely. In the research project "StadtLärm", a first technical system was developed for this purpose. Its sensors not only measure the familiar noise levels but also identify the sound class, i.e., the origin of the sound...
Conference Paper
Full-text available
Western classical music comprises a rich repertoire composed for different ensembles. Often, these ensembles consist of instruments from one or two of the families woodwinds, brass, piano, vocals, and strings. In this paper, we consider the task of automatically recognizing instrument families from music recordings. As one main contribution, we i...
Conference Paper
Full-text available
Electroacoustic music is experienced primarily through hearing, as it is not usually based on a prescriptive score. For the analysis of such pieces, transcriptions are sometimes created to illustrate events and processes graphically in a readily comprehensible way. These are usually based on the spectrogram of the recording. Although transcriptions...
Preprint
Full-text available
This is a late-breaking demo session submission. It describes a side project I have been working on this year. I describe the concept of a music learning platform for bass players and drummers, the results of an initial user study, as well as an early app prototype.
Conference Paper
Full-text available
In this paper, we build upon a recently proposed deep convolutional neural network architecture for automatic chord recognition (ACR). We focus on extending the commonly used major/minor vocabulary (24 classes) to an extended chord vocabulary of seven chord types with a total of 84 classes. In our experiments, we compare joint and separate classifi...
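The extended vocabulary can be made concrete by enumerating 12 chromatic roots times 7 chord types, which yields the 84 classes mentioned above; the specific chord-type labels chosen here are an assumption.

```python
# Hypothetical sketch of how a 7-type chord vocabulary expands the usual
# 24-class major/minor setup to 84 classes (12 roots x 7 chord types).
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
CHORD_TYPES = ["maj", "min", "7", "maj7", "min7", "dim", "aug"]   # assumed set

VOCABULARY = [f"{root}:{ctype}" for root in ROOTS for ctype in CHORD_TYPES]
print(len(VOCABULARY))        # 84
print(VOCABULARY[:3])         # ['C:maj', 'C:min', 'C:7']
```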
Conference Paper
Full-text available
In this paper, we evaluate hand-crafted features as well as features learned from data using a convolutional neural network (CNN) for different fundamental frequency classification tasks. We compare classification based on full (variable-length) contours and classification based on fixed-sized subcontours in combination with a fusion strategy. Our...
Conference Paper
Full-text available
For musicological studies on large corpora, the compilation of suitable data constitutes a time-consuming step. In particular, this is true for high-quality symbolic representations that are generated manually in a tedious process. A recent study on Western classical music has shown that musical phenomena such as the evolution of tonal complexity o...
Conference Paper
Predominant instrument recognition in ensemble recordings remains a challenging task, particularly if closely-related instruments such as alto and tenor saxophone need to be distinguished. In this paper, we build upon a recently-proposed instrument recognition algorithm based on a hybrid deep neural network: a combination of convolutional and full...
Conference Paper
In this paper, we consider two methods to improve an algorithm for bass saliency estimation in jazz ensemble recordings which are based on deep neural networks. First, we apply label propagation to increase the amount of training data by transferring pitch labels from our labeled dataset to unlabeled audio recordings using a spectral similarity mea...
Article
Full-text available
Retrieving short monophonic queries in music recordings is a challenging research problem in Music Information Retrieval (MIR). In jazz music, given a solo transcription, one retrieval task is to find the corresponding (potentially polyphonic) recording in a music collection. Many conventional systems approach such retrieval tasks by first extracti...
Article
Full-text available
Web services allow permanent access to music from all over the world. Especially in the case of web services with user-supplied content, e.g., YouTube™, the available metadata is often incomplete or erroneous. On the other hand, a vast amount of high-quality and musically relevant metadata has been annotated in research areas such as Music Informat...
Chapter
The use of pitch-informed solo and accompaniment separation as a tool for the creation of practice content
Conference Paper
Motivated by the recent success of deep learning techniques in various audio analysis tasks, this work presents a distributed sensor-server system for acoustic scene classification in urban environments based on deep convolutional neural networks (CNN). Stacked autoencoders are used to compress extracted spectrogram patches on the sensor side befor...
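A minimal sketch of the compression idea, assuming a single (rather than stacked) autoencoder over flattened spectrogram patches; all dimensions are illustrative, not taken from the paper.

```python
# Hypothetical sketch: an autoencoder compresses flattened spectrogram patches
# on the sensor; only the low-dimensional code would be transmitted, and the
# server-side classifier works on that code.
import torch
import torch.nn as nn

patch_dim, code_dim = 64 * 64, 128

encoder = nn.Sequential(nn.Linear(patch_dim, 512), nn.ReLU(),
                        nn.Linear(512, code_dim))
decoder = nn.Sequential(nn.Linear(code_dim, 512), nn.ReLU(),
                        nn.Linear(512, patch_dim))

patches = torch.rand(16, patch_dim)         # fake flattened spectrogram patches
codes = encoder(patches)                    # what the sensor would transmit
reconstruction = decoder(codes)             # reconstruction check
loss = nn.functional.mse_loss(reconstruction, patches)
print(codes.shape, float(loss))             # torch.Size([16, 128]) ...
```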
Conference Paper
In this paper, we focus on transcribing walking bass lines, which provide clues for revealing the actual played chords in jazz recordings. Our transcription method is based on a deep neural network (DNN) that learns a mapping from a mixture spectrogram to a salience representation that emphasizes the bass line. Furthermore, using beat positions, we...
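As an assumed illustration of the final step, here is how a bass salience representation and a beat grid could be turned into one pitch estimate per beat (simple argmax aggregation; the paper's decoding may be more elaborate).

```python
# Hypothetical sketch: given a (pitch, frame) bass salience matrix and beat
# positions (in frames), pick one bass pitch per beat segment. Sizes are toy.
import numpy as np

def beatwise_bass_pitches(salience, beat_frames, midi_offset=28):
    """salience: (n_pitches, n_frames); returns one MIDI pitch per beat."""
    pitches = []
    for start, end in zip(beat_frames[:-1], beat_frames[1:]):
        segment = salience[:, start:end].sum(axis=1)
        pitches.append(midi_offset + int(np.argmax(segment)))
    return pitches

rng = np.random.default_rng(0)
salience = rng.random((49, 400))             # e.g. E1 (MIDI 28) .. E5
beats = np.arange(0, 401, 50)                # beat grid every 50 frames
print(beatwise_bass_pitches(salience, beats))
```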
Article
This paper deals with the automatic transcription of solo bass guitar recordings with an additional estimation of playing techniques and fretboard positions used by the musician. Our goal is to first develop a system for a robust estimation of the note parameters pitch, onset, and duration (score-level parameters). As a second step, we aim to autom...
Article
Both the collection and analysis of large music repertoires constitute major challenges within musicological disciplines such as jazz research. Automatic methods of music analysis based on audio signal processing have the potential to assist researchers and to accelerate the transcription and analysis of music recordings significantly. In this pape...
Conference Paper
The audio mixing process is an art that has proven to be extremely hard to model: What makes a certain mix better than another one? How can the mixing processing chain be automatically optimized to obtain better results in a more efficient manner? Over the last years, the scientific community has exploited methods from signal processing, music info...
Article
Full-text available
The metaphor of storytelling is widespread among jazz performers and jazz researchers. However, little is known about the precise meaning of this metaphor on an analytical level. The present paper attempts to shed light on the connected semantic field of the metaphor and relate it to its musical basis by investigating time courses of selected music...
Article
Full-text available
We present a novel approach to the analysis of jazz solos based on the categorisation and annotation of musical units on a middle level between single notes and larger form parts. A guideline during development was the hypothesis that these midlevel units (MLU) correspond to the improvising musicians’ playing ideas and action plans. A system of cat...
Patent
Tone input device having a tone signal input, a tone signal output and a sound classifier connected to the tone signal input for receiving a tone signal incoming at the tone signal input and for analyzing the tone signal for identifying, within the tone signal, one or several tone signal passages corresponding to at least one condition. Further, th...
Conference Paper
Full-text available
The paper presents new approaches for analyzing the characteristics of intonation and pitch modulation of woodwind and brass solos in jazz recordings. To this end, we use score-informed analysis techniques for source separation and fundamental frequency tracking. After splitting the audio into a solo and a backing track, a reference tuning frequenc...
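A typical building block for such an intonation analysis is expressing f0 values as cent deviations from the nearest equal-tempered pitch given a reference tuning frequency; the sketch below shows that standard computation, not code from the paper.

```python
# Hypothetical sketch: deviation of f0 values from equal temperament in cents,
# relative to an estimated reference tuning frequency.
import numpy as np

def cent_deviation(f0_hz, tuning_hz=440.0):
    """Deviation of each f0 (Hz) from the closest equal-tempered pitch, in cents."""
    f0_hz = np.asarray(f0_hz, dtype=float)
    midi = 69.0 + 12.0 * np.log2(f0_hz / tuning_hz)   # fractional MIDI pitch
    return 100.0 * (midi - np.round(midi))            # roughly -50 .. +50 cents

print(np.round(cent_deviation([440.0, 446.0, 883.0], tuning_hz=440.0), 1))
# approximately [ 0.  23.4  5.9]
```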
Conference Paper
Full-text available
In this paper, we aim at analyzing the use of dynamics in jazz improvisation by applying score-informed source separation and automatic estimation of note intensities. A set of 120 jazz solos taken from the Weimar Jazz Database covering many different jazz styles was manually transcribed and annotated by musicology and jazz students within the Jazz...
Conference Paper
Full-text available
In this paper, we focus on the automatic classification of jazz records. We propose a novel approach where we break down the ambiguous task, which is commonly referred to as genre classification, into three more specific semantic levels. First, the rhythm feel (swing, latin, funk, two-beat) characterizes the basic groove organization and most often...
Conference Paper
In this paper, we propose an instrument-centered bass guitar transcription algorithm. Instead of aiming at a general-purpose bass transcription algorithm, we incorporate knowledge about the instrument construction and typical playing techniques of the electric bass guitar. In addition to the commonly extracted score-level parameters note onset, off...
Conference Paper
Full-text available
Instrument recognition is an important task in music information retrieval (MIR). Whereas the recognition of musical instruments in monophonic recordings has been studied widely, the polyphonic case still is far from being solved. A new approach towards feature-based instrument recognition is presented that makes use of redundancies in the harmonic...
Conference Paper
Full-text available
Over the past years, the detection of onset times of acoustic events has been investigated in various publications. However, to our knowledge, there is no research on event detection on a broader scale. In this paper, we introduce a method to automatically detect "big" events in music pieces in order to match them with events in videos. Furthermore...
Conference Paper
Full-text available
In this paper, we present a novel approach to real-time detection of the string number and fretboard position from polyphonic guitar recordings. Our goal is to assess whether a music student is correctly performing guitar exercises presented via music education software or a remote guitar teacher. We combine a state-of-the-art approach for multi-pitch...
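For context, the combinatorial ambiguity behind this task can be illustrated by listing all (string, fret) candidates for a detected pitch in standard guitar tuning; the snippet below is an assumed illustration, not part of the proposed method.

```python
# Hypothetical sketch: enumerate all playable (string, fret) positions for a
# given MIDI pitch in standard guitar tuning.
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]     # E2 A2 D3 G3 B3 E4 (MIDI)
N_FRETS = 19

def fretboard_candidates(midi_pitch, tuning=STANDARD_TUNING, n_frets=N_FRETS):
    return [(string + 1, midi_pitch - open_pitch)
            for string, open_pitch in enumerate(tuning)
            if 0 <= midi_pitch - open_pitch <= n_frets]

print(fretboard_candidates(55))   # G3: [(1, 15), (2, 10), (3, 5), (4, 0)]
```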
Conference Paper
In this paper, we propose a novel approach for music similarity estimation. It combines temporal segmentation of music signals with source separation into so-called tone objects. We solely use the timbre-related audio features Mel-Frequency Cepstral Coefficients (MFCC) and Octave-based Spectral Contrast (OSC) to describe the extracted tone objects....
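As an assumed illustration (using librosa rather than the authors' feature extractor), a tone object could be summarized by MFCC and octave-based spectral contrast statistics like this:

```python
# Hypothetical sketch: describe a "tone object" (a short audio segment) by the
# mean and standard deviation of MFCC and octave-based spectral contrast
# features. Segment boundaries are assumed to be given.
import numpy as np
import librosa

def tone_object_features(segment, sr=22050):
    mfcc = librosa.feature.mfcc(y=segment, sr=sr, n_mfcc=13)
    contrast = librosa.feature.spectral_contrast(y=segment, sr=sr)
    frames = np.vstack([mfcc, contrast])               # (13 + 7, n_frames)
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

sr = 22050
segment = np.random.default_rng(0).standard_normal(sr).astype(np.float32)
print(tone_object_features(segment, sr).shape)         # (40,)
```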
Article
In this paper, we present a comparative study of three different classification paradigms for genre classification based on repetitive basslines. In spite of a large variety in terms of instrumentation, a bass instrument can be found in most music genres. Thus, the bass track can be analysed to explore stylistic similarities between music genres. W...
Conference Paper
In this paper, we present a feature-based approach to automatically estimate the string number in recordings of the bass guitar and the electric guitar. We perform different experiments to evaluate the classification performance on isolated note recordings. First, we analyze how factors such as the instrument, the playing style, and the pick-up se...
Conference Paper
In this paper, we present a novel audio synthesis model that allows us to simulate bass guitar tones with 11 different playing techniques to choose from. In contrast, previous approaches focussing on bass guitar synthesis only implemented the two slap techniques. We apply a digital waveguide model extended by different modular parts to imitate the...
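For intuition, the simplest relative of a digital waveguide string model is the Karplus-Strong algorithm; the sketch below synthesizes a plucked bass-like tone this way and is not the extended model described in the paper.

```python
# Hypothetical sketch: Karplus-Strong plucked-string synthesis, a basic special
# case of a digital waveguide (the real model adds modules per playing technique).
import numpy as np

def pluck(f0=41.2, sr=44100, duration=2.0, damping=0.996, rng=None):
    rng = rng or np.random.default_rng()
    delay = int(round(sr / f0))                    # delay-line length in samples
    line = rng.uniform(-1.0, 1.0, delay)           # noise burst = the "pluck"
    out = np.empty(int(sr * duration))
    for n in range(len(out)):
        out[n] = line[n % delay]
        # averaging two neighbouring samples acts as the string's loss filter
        line[n % delay] = damping * 0.5 * (line[n % delay] + line[(n + 1) % delay])
    return out

tone = pluck(f0=41.2)                              # roughly an open E string (E1)
print(tone.shape, float(np.abs(tone).max()))
```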
Chapter
Full-text available
This paper addresses the use of Music Information Retrieval (MIR) techniques in music education and their integration in learning software. A general overview of systems that are either commercially available or in the research stage is presented. Furthermore, three well-known MIR methods used in music learning systems and their state-of-the-art are d...